Abstract

We present a state-of-the-art report on visualization in astrophysics. We survey representative papers from both astrophysics 
and visualization and provide a taxonomy of existing approaches based on data analysis tasks. The approaches are classified 
based on five categories: data wrangling, data exploration, feature identification, object reconstruction, as well as education 
and outreach. Our unique contribution is to combine the diverse viewpoints from both astronomers and visualization experts to 
identify challenges and opportunities for visualization in astrophysics. The main goal is to provide a reference point to bring
modern data analysis and visualization techniques to the rich datasets in astrophysics.


1. Introduction 


Modern astronomers are recording an increasing amount of infor- 
mation for a larger number of astronomical objects and making 
more complex predictions about the nature of these objects and 
their evolution over cosmic time. Both successes are being driven 
by advances in experimental and computational infrastructure. As 
a result, the next generation of computations and surveys will put 
astronomers face to face with a “digital tsunami” of both simulated 
and observed data. These data present opportunities to make enor-
mous strides both in discovering more about our universe and in
advancing state-of-the-art visualization methodologies.


This state-of-the-art report serves as a starting point to bridge the 
knowledge gap between the astronomy and visualization communi- 
ties and catalyze research opportunities. Astronomy has a long and 
rich history as a visual science. Images of the cosmos have been 
used to build theories of physical phenomena for millennia. This 
history makes astronomy a natural area for fruitful collaborations 
between visualization and astronomy. A substantial fraction of pre- 
vious work at this scientific intersection has therefore focused on 
image reconstruction — generating the most precise representation 
from a series of images of a patch of the sky — typically using op- 
timizations and signal processing techniques. Advances in image 
reconstruction have enabled great breakthroughs in astronomy, in- 
cluding the recent imaging of a black hole [EAA* 19]. However, in 
this report, we focus on modern visualization techniques, which in- 
clude 3D rendering, interaction, uncertainty visualization, and new 
display platforms. This report, authored by experts in both astronomy
and visualization, will help visualization experts better under-
stand the needs and opportunities of astronomical visualization, and 
provide a mechanism for astronomers to learn more about cutting- 
edge methods and research in visualization as applied to astronomy. 


Comparison with related surveys. Several studies have fo- 
cused on surveying visualization of astronomical data. Hassan 
et al. [HF11] surveyed scientific visualization in astronomy from 
1990 to 2010. They studied visualization approaches for N-body 
particle simulation data and spectral data cubes — two areas they 
identified as the most active fields. They classified research papers 
in these areas based on how astronomical data are stored (i.e., as 
points, splats, isosurfaces, or volumes) and which visualization 
techniques are used. They also discussed visualization workflows 
and public outreach, and reviewed existing software for astronomical
visualization.


Lipsa et al. [LLC* 12], on the other hand, took a broader view in 
surveying visualization for the physical sciences, which included 
astronomy and physics. For astronomy, the papers are classified 
based on the visualization challenges they tackle: multi-field visu- 
alization, feature detection, modeling and simulation, scalability, 
error/uncertainty visualization, and global/local visualization. 


Hassan et al. excelled at classifying papers based on data types 
and considering how different types of data could be visualized. 
Lipsa et al. focused more on visualization techniques. A data- 
centered classification is useful for researchers to explore diverse 
ways to visualize their data, whereas a technique-centered classi- 
fication can be useful for researchers who want to explore their 
data using a particular visualization technique. Our survey aims to
strike a balance between these two classification schemes and clas- 
sifies the papers primarily based on data tasks and secondarily on 
visualization techniques, thus allowing researchers to explore how 
they can best visualize the data at hand based on the analysis tasks 
they have in mind. We also utilize tertiary categories in topical ar- 
eas in astronomy for cross-references for the astronomy audience. 
Namely, we classify papers based on extragalactic, galactic, plan- 
etary, and solar astronomy. We further label each paper as dealing 
with simulated or observational astronomical data. 


To the best of our knowledge, no comprehensive survey of visu- 
alization in astronomy has been conducted since 2012. Advances 
in both astronomical data and visualization in the past decade 
present a need for an updated state-of-the-art report. In 2011, Has- 
san et al. identified six grand challenges for scientific visualiza- 
tion in astronomy in the era of peta-scale data. Our survey dis- 
cusses how the community has responded to these challenges in 
the past decade. The unique contribution of this survey is the 
cross-discipline discussion between visualization experts and as- 
tronomers via two workshops (a mini-workshop in April 2020 and 
an IEEE VIS workshop in October 2020), where researchers from 
both fields worked together in identifying progress, challenges, and 
opportunities in astronomical visualization. This survey aims to 
become a reference point for building connections and collabora- 
tions between two communities: data-rich, but technique-hungry, 
astronomers and data-hungry, but technique-rich, visualization ex- 
perts. We further discuss datasets in astronomy in need of new ap- 
proaches and methodologies, visualization techniques that have not 
been applied to astronomical datasets, and visualization techniques 
that can enhance the educational value of astronomical datasets. 


In Sect. 2 we define our primary, secondary, and tertiary cat- 
egories of approaches based on data analysis task, visualization 
technique, and topical area in astronomy, respectively. In Sect. 3, 
4, 5, 6, and 7 we discuss and group papers based on the primary 
categories of data wrangling, data exploration, feature identifica- 
tion, object reconstruction, education and outreach, respectively. In 
Sect. 8 we identify challenges and opportunities for astronomy vi- 
sualization. We provide a navigation tool of the surveyed papers in 
Sect. 9, and we summarize our conclusions in Sect. 10. 


To make the survey results more accessible and actionable to 
the research community, all papers surveyed, including associated 
metadata, can be explored online with a visual literature browser 
(https://tdavislab.github.io/astrovis-survis) developed
with the SurVis [BKW16] framework.


2. Literature Research Procedure and Classification 


We reviewed representative papers over the past 10 years (between 
2010 and 2020) in the fields of astronomy and visualization that 
contain strong visualization components for astronomical data. The 
annotation of each paper was guided primarily by a set of data anal- 
ysis tasks; secondarily by a set of visualization techniques; and fi- 
nally by a set of topical areas in astronomy. We view these three 
categories as being on equal footing and not necessarily hierarchi- 
cal. Instead, they are considered as orthogonal dimensions and pro- 
vide complementary viewpoints. We organize the literature according
to these three categories to provide a means of navigation from
task-driven, technique-driven, and topic-driven perspectives.

Figure 1: A typology of task-driven (primary), technique-driven
(secondary), and topic-driven (tertiary) categories used in this
survey paper.


The literature surveyed spans venues in visualization such as 
IEEE Transactions on Visualization and Computer Graphics, Com- 
puter Graphics Forum, and IEEE Computer Graphics and Applica- 
tions; and astronomy such as Astrophysical Journal and Astrophys- 
ical Journal Letters, Monthly Notices of the Royal Astronomical 
Society, Astronomy and Computing, .Astronomy (Dot Astronomy), 
ADASS Conference Series, PASP (Publications of the Astronomical 
Society of the Pacific), and Research Notes of the AAS. We also discuss
data types that include simulation data and observation data, with 
the latter encompassing both image data and tabular data. Fig. 1 
shows a typology of primary, secondary, and tertiary categories 
used in this survey. 


2.1. Task-Driven Categories: Data Analysis Tasks 


Our literature review allowed us to identify five primary categories 
of approaches based on data analysis tasks: 


• Data wrangling, which transforms astronomy data into formats
that are appropriate for general purpose visualization tools;

• Data exploration, where users explore a dataset in an
unstructured way to discover patterns of interest;

• Feature identification, which visually guides the identification
and extraction of features of interest;

• Object reconstruction, which provides an informative visual
representation of an astronomical object;

• Education and outreach, where astronomical data or data
products are made accessible to the general public.


In an on-going paradigm shift in scientific outreach, technolog- 
ical advances are enabling data-driven and interactive exploration 
of astronomical data in museums and science centers. Hence, we 
include “education and outreach” as a data analysis category. The 
word “feature” generally means a measurable piece of data that can 
be used for analysis, whereas the word “object” may be considered 
as a “feature” with sharp and/or discontinuous contrast in a dimen-
sion of scientific interest. Whether a specific aspect of a dataset 
is considered an “object” or a “feature” depends on the scientific 
question at hand. We separate object reconstruction from feature 
identification to be compatible with the literature, but we envision 
a future in which these entities are recognized as a continuum. 


2.2. Technique-Driven Categories: Visualization Techniques 


Our secondary categories of approaches are based on visualization 
techniques employed for astronomical data: 


• 2D/3D plots that encompass classic 2D/3D plots such as
histograms, scatter plots, pie charts, bar plots, and line plots;

• 2D images that utilize image processing techniques to generate
images of astronomy data;

• 3D rendering that generates representations of 3D volumetric
data of interest;

• Interactive visualization that includes techniques such as
linked views, detail on demand, visual filtering, and querying;

• Dimensionality reduction that transforms data from a
high-dimensional into a property-preserving low-dimensional space
as part of the visualization pipeline;

• Uncertainty visualization that improves our ability to reason
about the data by communicating the uncertainties that arise
due to randomness in data acquisition and processing;

• New display platforms that communicate data via techniques
such as data physicalization and virtual reality.

Although dimensionality reduction can be used as a purely data 
analysis strategy for noise reduction, clustering, or downstream 
analysis, it also serves as an integrated part of the visualization 
pipeline to facilitate data exploration and understanding. In this sur-
vey, we focus on the use of dimensionality reduction in the context
of visualization. Dimensionality reduction and clustering may be 
both considered as data preprocessing techniques, but we choose 
to exclude clustering as a category as it is a generic class of tech- 
niques implicitly implemented within many toolboxes and does not 
typically represent a main innovation of the surveyed research. 


We highlight the new display platforms as a category based on 
our experiences and workshops held among a growing “visualiza- 
tion in astrophysics” community. We believe there is a strong mo- 
tivation for this research direction as the community as a whole is 
ready for the next stage of scientific discovery and science commu- 
nications enabled by new displays. 


We also acknowledge that there are additional ways to think 
about categories based on visualization techniques. For instance, 
scalable, multi-field, comparative, and time-dependent visualiza- 
tion are all categories mentioned in the 2012 survey of Lipsa et al. 
However, as technology has evolved over the past decade, certain 
visualization techniques (e.g., scalable and comparative visualiza- 
tion) have become commonplace and thus lack specificity. Time- 
dependent visualization (Sect. 8.5), in particular, the interplay be- 
tween spatial and temporal dimensions, will be crucial as more time 
series astronomy data become available in the near future. In this 
survey, we choose specific visualization techniques that capture the 
state of the art and lead to informative categorization. 


2.3. Topic-Driven Categories: Topical Areas in Astronomy 


Our tertiary categories are based upon the list of topics from 
the National Science Foundation (NSF) Astronomy & Astro- 
physics directorate. These categories are used as a cross-reference 
for an astrophysics audience. We also investigated a curated list 
of research topics in astronomy and astrophysics provided by 
the American Astronomical Society (AAS) (https://aas.org/ 
meetings/aas237/abstracts). We decided to work with the 
coarser classification from NSF since the AAS list is overly re- 
fined and specialized for the purposes of this survey. Our tertiary 
categories are: 


• Extragalactic astronomy

• Galactic astronomy

• Planetary astronomy

• Solar astronomy and astrophysics


In addition, we have labeled each paper with two tags: 


• Simulated astronomical data

• Observational astronomical data


For readers unfamiliar with certain terminology in astronomy or 
astrophysics, we recommend the astrophysics glossaries from the 
National Aeronautics and Space Administration (NASA)
(https://science.nasa.gov/glossary/) or the LEVEL5 Knowledge
Base on Extragalactic Astronomy and Cosmology
(https://ned.ipac.caltech.edu/level5/). Meanwhile, we try our best to
describe relevant terms the first time they are introduced in the sur- 
vey. We would like to point out that even though certain terminol- 
ogy may appear to be rather straightforward, in some cases, defini- 
tions vary within the field, and thus some attention must be given to 
the precise work in question. For example, the term halo typically 
refers to overdensities in the dark matter but the exact boundary of 
a halo in a specific calculation may vary (e.g., [KPH13]). 


Overview. One of the main contributions of this paper is the 
classification of existing works, which are summarized in Sect. 3 
to Sect. 7. The methods of classification reflect the authors’ expe- 
rience that comes from several meetings with experts in the astro- 
nomical visualization community. For each surveyed paper, we use 
our best judgment to infer its primary and secondary categories, al- 
though such classification may not be perfect; many papers span 
multiple categories. The best way to explore our classification is 
to use the table for each section (from Table 1 to Table 5) as a 
roadmap. 


We acknowledge that many effective tools were actively used 
in astronomy research published prior to 2010. We emphasize that 
this paper is not a comprehensive catalog of all tools used in as- 
tronomy, nor does it include pre-2010 works. Rather, this paper 
surveys active areas of visualization research in astronomy as iden- 
tified in publications in the last decade (2010-2021). We also note 
that whereas “astronomy” has previously meant the cataloging of 
the positions and motions of objects in the sky, and “astrophysics” 
the physical understanding of those objects, in this survey, we con- 
sider “astronomy” and “astrophysics” to be synonymous since few 
astronomers make the above distinction. In fact, by “visualization 
in astrophysics”, we consider the intersection of visualization with 
astronomy, astrophysics, and space exploration. 
Table 1: Classifying papers under data wrangling based on secondary and tertiary categories. Top row, from left to right: (primary cate-
gory) Data wrangling; (secondary categories) 2D/3D plots, 2D images, 3D rendering, interactive visualization, dimensionality reduction,
uncertainty visualization, and new display platforms; (tertiary categories) extragalactic, galactic, planetary, and solar astronomy; (tags)
simulated and observational data.


3. Data Wrangling 


Data wrangling is the process of transforming raw data into forms 
that more effectively support downstream analysis [KHP* 11]. This 
process is an important step for astronomy visualization because 
raw simulation or observational data require significant wrangling 
into a suitable form for visualization tasks. In this section, we cat- 
egorize papers that present novel work in data wrangling for as- 
tronomy visualization. Many established tools are available for data 
wrangling across specific areas of astronomy, but a full survey of 
such tools is not within the scope of this survey. High-dimensional 
data abstractions such as data cubes are commonly used in astro- 
physical sciences and are often stored in the FITS format. Many 
of the papers placed in this category focus on transforming raw 
astrophysical data cubes into suitable data formats that can be in- 
gested into open-source visualization tools, such as Blender and 
Houdini. Others introduce new formats that can be used to support 
various tools for data representation and data analysis. Authors of 
data wrangling papers have often made significant efforts to intro- 
duce astronomers to the visualization pipelines using these tools. 
We further classify these papers using our secondary categoriza- 
tion on visualization techniques (Sect. 2.2). Table 1 presents an 
overview of our categorization of data wrangling papers. 


Using Blender to visualize astrophysics data. Blender [Ble02] is 
an open-source 3D graphics and visualization tool that supports a 
wide range of modeling, animation, and rendering functionality. A 
range of papers have discussed its usefulness for presenting astron- 
omy data, and described pipelines for transforming raw data into 
scientific visualizations. Kent [Ken13] demonstrated how Blender 
can be used to visualize galaxy catalogs, astronomical data cubes, 
and particle simulations. Taylor [Tay15] introduced FRELLED, a 
Python-based FITS viewer for exploring 3D spectral line data us- 
ing Blender that visualizes 3D volumetric data with arbitrary (non- 
Cartesian) coordinates [Tay17b] and is designed for real-time and
interactive content. Using this viewer, astronomers are able to speed
up visual cataloging by as much as 50x. Gárate [Gar17] described 
the process of importing simulation outputs from astrophysical hy- 
drodynamic experiments into Blender using the voxel data format. 
In order to facilitate immersive data exploration, Kent [Ken17] pre- 
sented a technique for creating 360° spherical panoramas using 
Blender and Google Spatial Media module. The method supports 
static spherical panoramas, single pass fly-throughs, and orbit fly- 
overs on browsers or mobile operating systems. 


AstroBlend [Nai12, Nai16] extends Blender, making it possible to
import and display various types of astronomical data interactively, 
see Fig. 2. AstroBlend is an open-source Python library that utilizes 
yt — an open-source software for analyzing and visualizing volumet- 
ric data — for 3D data visualization (yt is discussed in Sect. 4). As- 
troBlend effectively bridges the gap between “exploratory” and “ex- 
planatory” visualization, as discussed by Goodman et al. [GBR18] 
and Ynnerman et al. [YLT18]. 


Using Houdini to visualize astrophysics data. In another example 
of adapting existing 3D graphics software, Naiman et al. [NBC17]
explored how the 3D procedural animation software Houdini can 
be used for astronomy visualization, producing high-quality vol- 
ume renderings for a variety of data types. They utilized yt to 
transform astronomical data into graphics data formats for Houdini, 
which bridges the astronomy and graphics communities. Houdini
is a compelling alternative to other rendering software (e.g., Maya 
and Blender) for astronomy because it produces high-quality vol- 
ume renderings and supports a variety of data types. 


Borkiewicz et al. [BNL19] presented a method for creating cin- 
ematic visualizations and time-evolving representations of astron- 
omy data that are both educational and aesthetically pleasing. The 
paper also provided a detailed workflow of importing nested, multi- 
resolution adaptive mesh refinement data into Houdini. 
Using ParaView to visualize astrophysics data. ParaView is an
open-source, general-purpose, multi-platform analysis and visu- 
alization tool for scientific datasets. It supports scripting (with 
Python), web-based visualization, and in situ analysis (using Cat- 
alyst). Woodring et al. [WHA*11] used ParaView to analyze and 
visualize large N-body cosmological simulations. N-body cosmo- 
logical simulations are simulations of large-scale structures that 
contain particles that interact only via gravity, in contrast to in- 
cluding gas, which also requires hydrodynamics. ParaView pro- 
vides particle readers (supporting “cosmo” and “GADGET” for- 
mats) and efficient halo finders, where a halo is a gravitationally 
bound structure on galactic scales. Together with existing visual- 
ization features, ParaView enables efficient and interactive visual- 
ization of large-scale cosmological simulations. Recent work from 
the IEEE VIS 2019 SciVis contest [NNPD19] used ParaView to
visualize HACC (Hardware/Hybrid Accelerated Cosmology Code) 
cosmological simulations [HPF* 16]. 
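
To make the notion of a halo finder concrete, the following is a minimal sketch of the friends-of-friends idea, one common family of halo-finding algorithms. It is an illustration only and not ParaView's implementation: the function name, the toy particle positions, and the linking length are all assumptions chosen for this example. In practice the linking length is usually set as a fraction of the mean inter-particle separation, and the pair search uses a spatial tree or grid rather than the brute-force loop shown here.

```python
from itertools import combinations
from math import dist


def friends_of_friends(positions, linking_length):
    """Group particles into halos: two particles are 'friends' if they lie
    within linking_length of each other, and a halo is a connected group."""
    parent = list(range(len(positions)))  # union-find forest, one tree per group

    def find(i):
        # Walk to the root of i's tree, compressing the path along the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Brute-force O(N^2) pair search: acceptable for a sketch only.
    for i, j in combinations(range(len(positions)), 2):
        if dist(positions[i], positions[j]) <= linking_length:
            parent[find(i)] = find(j)

    # Collect particle indices by the root of their group.
    groups = {}
    for i in range(len(positions)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values(), key=len, reverse=True)


# Hypothetical toy data: two tight clumps and one isolated particle.
particles = [
    (0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.1, 0.1, 0.0),
    (5.0, 5.0, 5.0), (5.1, 5.0, 5.0),
    (9.0, 0.0, 0.0),
]
halos = friends_of_friends(particles, linking_length=0.2)
```

Each entry of `halos` lists the indices of the particles in one group, largest group first; a real pipeline would typically discard groups below a minimum particle count before treating them as halos.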


Data wrangling to support visualization. Beyond the integration 
of visualization techniques into popular 3D software platforms, a 
range of projects have explored the transformation of astrophysi- 
cal data into formats suitable for different forms of presentation, 
immersion, and analysis. Data wrangling is a perennial concern, 
and as new display formats are introduced or made more widely 
accessible, researchers investigate how best to target them. For ex-
ample, predating our survey, Barnes et al. [BFBP06] introduced
S2PLOT, a 3D plotting library for astronomy that supports dynamic 
geometry and time-varying datasets. S2PLOT has been used to con- 
struct models of planetary systems and create outputs for viewing 
on stereoscopic displays and in digital domes [FBO06]. Barnes and
Fluke [BF08] described a technique to embed interactive figures
created with S2PLOT into Adobe PDF files to augment astronomy 
research papers, including 3D renderings of cosmological simula- 
tions and 3D models of astronomy instrumentation. 


Some earlier approaches to data wrangling continue to be use-
ful for more contemporary projects. The Montage Image Mosaic
Engine [Arc05] enables users to stitch a “mosaic” together from
sets of individual FITS images, and supports a range of image
manipulation functionality, such as pixel sampling, image
projection/rotation, background rectification, and animation. Montage can
be used to create sky coverage maps and animations of data cubes,
and its data wrangling capabilities have been integrated into other
visualization tools. For example, mViewer, which can be scripted
using Python, creates multi-color JPEG and PNG representations
of FITS images and provides a wide range of functionality to sup-
port various types of image overlays, such as coordinate displays,
labels, and observation footprints [BG17].

Figure 2: A screenshot from a visualization session in AstroBlend,
a Blender-based 3D rendering and analysis tool. Image reproduced
from Naiman et al. [Nai16].


Vogt et al. [VOVMB16] introduced the X3D pathway for im-
proving access to data visualization by promoting the use of inter-
active 3D astrophysics diagrams based on the X3D format, which 
can be shared online or incorporated into online publications. Vogt 
et al. [VSDR17] demonstrated the potential of this “pathway” by 
interactively visualizing integral field spectrographs observed in a 
young supernova remnant in the Small Magellanic Cloud. First, 
they created an interactive diagram of a reconstructed 3D map of 
the O-rich ejecta and exported it to the X3D file format. Second, 
they utilized (and extended) the visualization tools provided by 
X3D to make the diagram interactive, such as the ability to toggle 
views, “peel” intensity layers to focus on particular ranges of data, 
and modify clip planes to slice the 3D model at certain locations or 
angles. 


Although the most common format for distributing astronomy 
images is FITS [WG79], Comrie et al. [CPST20] suggested that the 
HDF5 format [FHK* 11] is better suited for hierarchical data and for 
facilitating efficient visualizations of large data cubes. They iden- 
tified various common visualization tasks, including the rendering 
of 2D slices; generating single-pixel profiles, region profiles, and 
statistics; and interactive panning and zooming, and introduced an
HDF5 hierarchical data schema to store precomputed data to facili-
tate these tasks. After integrating the HDF5 schema with the image
viewer CARTA [OC20], they demonstrated that their schema was 
able to obtain up to 10° speed-ups for certain tasks. For example, 
precomputing and storing a dataset of histograms for each chan- 
nel of a Stokes cube enables CARTA to display the histograms for 
an entire data cube with minimal delay. CARTA is part of CASA — 
the Common Astronomy Software Applications package — a pri- 
mary data processing software for radio telescopes, including the 
Atacama Large Millimeter/submillimeter Array (ALMA) and the 
Karl G. Jansky Very Large Array (VLA). CASA [Jae08] supports 
data formats from ALMA and VLA, and is equipped with func- 
tionalities such as automatic flagging of bad data, data calibration, 
and image manipulation. It has also been used to simulate observa-
tions. It comes with a graphical user interface with a viewer, plotter,
logger, and table browser [Jae08]. CASA has some recent develop- 
ments that enhance user experience [ERG19], including increased 
flexibility in Python and data visualization with CARTA. 
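
The precomputation idea behind the HDF5 schema discussed above can be sketched in a few lines. The example below is a simplified illustration, not the schema of Comrie et al. or the CARTA code: it uses plain nested lists in place of an HDF5 file, and the function names, bin range, and toy cube are assumptions made for this sketch. The point is only that per-channel histograms are computed once, when the data are written, so that a viewer can later answer a histogram request with a lookup instead of a pass over the cube.

```python
def channel_histograms(cube, n_bins, lo, hi):
    """Return one histogram (list of bin counts) per spectral channel.

    cube is indexed as cube[channel][row][column]. Values outside [lo, hi]
    are ignored, and hi itself is counted in the last bin.
    """
    width = (hi - lo) / n_bins
    histograms = []
    for channel in cube:
        counts = [0] * n_bins
        for row in channel:
            for value in row:
                if lo <= value <= hi:
                    index = min(int((value - lo) / width), n_bins - 1)
                    counts[index] += 1
        histograms.append(counts)
    return histograms


# Hypothetical 2-channel, 2x3-pixel cube standing in for a real image cube.
cube = [
    [[0.0, 0.5, 1.0], [1.5, 2.0, 2.5]],
    [[3.0, 3.5, 4.0], [0.0, 0.0, 4.0]],
]

# Done once, at write time ...
precomputed = channel_histograms(cube, n_bins=4, lo=0.0, hi=4.0)


# ... so that a viewer request becomes a constant-time lookup.
def histogram_for_channel(channel_index):
    return precomputed[channel_index]
```

The trade-off is extra storage and a slower write in exchange for faster interactive reads, which suits data that are written once and browsed many times.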


Vogt and Wagner advocated for the use of stereoscopy visualiza- 
tion, or “stereo pairs”, to enhance the perception of depth in multi- 
dimensional astrophysics data [VW12]. Their technique involves 
sending distinct images to each eye, and supports both parallel and 
cross-eyed viewing techniques. They described a straightforward 
method to construct stereo pairs from data cubes using Python, and 
used various examples of both observational and theoretical data to 
demonstrate the potential of stereoscopy for visualizing astrophys- 
ical datasets. 
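
As a rough illustration of how a stereo pair might be built from a data cube, the sketch below shifts each depth slice horizontally in proportion to its depth and takes a maximum-intensity projection, once with positive and once with negative parallax. This is a deliberately simple stand-in and not necessarily the method of Vogt and Wagner: the function names, the shear-based parallax, the sign convention for the two eyes, and the toy cube are all assumptions, and intensities are assumed to be non-negative.

```python
def project_with_parallax(cube, shift_per_slice):
    """Maximum-intensity projection along depth, with each depth slice
    shifted horizontally in proportion to its depth index.

    cube is indexed as cube[depth][row][column] and is assumed to hold
    non-negative intensities. Pixels shifted outside the frame are dropped.
    """
    n_rows, n_cols = len(cube[0]), len(cube[0][0])
    image = [[0.0] * n_cols for _ in range(n_rows)]
    for depth, plane in enumerate(cube):
        shift = round(depth * shift_per_slice)
        for r in range(n_rows):
            for c in range(n_cols):
                target = c + shift
                if 0 <= target < n_cols:
                    image[r][target] = max(image[r][target], plane[r][c])
    return image


def stereo_pair(cube, shift_per_slice=1):
    """Return two views of the cube rendered with opposite parallax."""
    left = project_with_parallax(cube, +shift_per_slice)
    right = project_with_parallax(cube, -shift_per_slice)
    return left, right


# Hypothetical cube: bright voxels at depth 0 and depth 1 on the same pixel.
cube = [
    [[0.0, 0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 9.0, 0.0, 0.0]],
    [[0.0, 0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 5.0, 0.0, 0.0]],
]
left, right = stereo_pair(cube, shift_per_slice=1)
```

The voxel at depth 0 lands on the same pixel in both images, while the deeper voxel is displaced in opposite directions; that disparity is what the visual system interprets as depth. Swapping the two images switches between parallel and cross-eyed viewing.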
Verbraeck and Eisemann [VE21] presented a technique for inter-
actively rendering black holes (see Fig. 3), illustrating how a black 
hole creates spacetime distortions in its environment due to gravi- 
tational lensing and redshift. The rendering algorithm first creates 
an adaptive grid that maps a uniform 360-view surrounding a vir- 
tual observer to the distorted view created by the black hole. This 
mapping is then used to optimize ray tracing through curved space- 
time. The rendering solution also includes an interpolation tech- 
nique that simulates the movement of the observer around the black 
hole, enabling interactive transitions between multiple sets of adap- 
tive grids. 


Figure 3: The projection of the distorted celestial sky caused by 
a Kerr black hole. Image reproduced from Verbraeck and Eise- 
mann [VE21]. 


Data wrangling will continue to be an important component of 
astrophysics research as new sensors, telescopes, and other space 
instruments are built that generate datasets at higher resolutions and 
consisting of new data types. New data transformation methods or 
modifications of existing methods will be required to interoperate 
with existing visualization tools and to expand the accessibility of 
the data, making the data available in forms suitable for presenta- 
tion, collaboration, interactive analysis, and public outreach. 


4. Data Exploration 


In this section, we summarize research efforts that use visualization 
to focus on exploratory data analysis [Tuk77]. Broadly speaking, 
the defining attribute of data exploration papers is a focus on facili- 
tating the unstructured investigation of a dataset in order to discover 
patterns of interest and formulate hypotheses. Our interpretation of 
data exploration follows Goodman’s perspective on studying high- 
dimensional data in astronomy, where “interactive exploratory data 
visualization can give far more insight than an approach where data 
processing and statistical analysis are followed, rather than accom- 
panied, by visualization.” [Goo12]. We distinguish between “het- 
erogeneous” and “hierarchical” data exploration to highlight the 
different methodologies employed, where heterogeneous refers to 
drawing together disparate datasets and hierarchical refers to a deep 
exploration of fundamentally similar datasets (perhaps at different 
resolutions). Table 2 presents an overview of our categorization of 
data exploration papers. 


4.1. Heterogeneous Data Exploration 


A number of astrophysics visualization software frameworks and 
tools have emphasized the value of exploring multiple datasets si- 
multaneously in order to generate new insight, often requiring (or 
facilitating) data transformation pre-processing steps. 


yt [TSO*10] is an open-source, flexible, multi-code data analysis
and visualization tool for astrophysics. Earlier versions of yt
focused on making it possible to examine slices and projected
regions within deeply nested adaptive mesh refinement simulations
[BNO*14]. Although still widely used for its data wrangling
capabilities, yt now also includes a range of data exploration and
feature identification functionalities, providing off-screen
rendering, interactive plotting capabilities, and scripting
interfaces. It efficiently processes large and diverse astrophysics
data, creating 2D visualizations with an adaptive projection
process and volume renderings with a direct ray casting method. Its
cross-code support enables analysis of heterogeneous data types and
facilitates cross-platform collaborations between different
astrophysics communities. To reduce processing time, yt adopts
parallelism and can run multiple independent processing units on a
single dataset in parallel. Apart from being easily customizable,
yt provides a number of pre-defined analysis modules for halo
finding, halo analysis, merger tree creation, and time series
analysis, among others, and a recent project makes it possible to
use yt for interactive data exploration within Jupyter notebooks
[MT20]. yt is also notable for its large, active community of users
and developers.


Filtergraph [BSP*13] is a web application that generates a range
of 2D and 3D figures. It is designed to reduce the “activation
energy” of the visualization process so that large and complex
astronomy datasets can be visualized flexibly and rapidly. It
accepts numerous file formats without meta-data specifications,
from text files to FITS images to NumPy files. The interface
enables users to plot their data as high-dimensional scatter plots,
histograms, and tables. Users can extensively explore the datasets
and switch between different representations without cognitive
interruption. Users can also customize the visualization through
various interactive capabilities, such as panning, zooming, data
querying, and filtering. Filtergraph also facilitates sharing
visualizations and collaborating on them.


Luciani et al. [LCO*14] introduced a web-based computing
infrastructure that supports the visual integration and efficient mining
of large-scale astronomy observations. The infrastructure overlays 
image data from three complementary sky surveys (SDSS, FIRST, 
and simulated LSST results) and provides real-time interactive ca- 
pabilities to navigate the integrated datasets, analyze the spatial 
distribution of objects, and cross-correlate image fields. Addition- 
ally, Luciani et al. described interactive trend images, which are 
pixel-based, compact visual representations that help users identify 
trends and outliers among large collections of spatial objects. 


ESASky [BGR*16], developed by the ESAC Science Data Centre, is a
web application designed for three use cases: the exploration of
multi-wavelength skies, the search and retrieval of data for
single or multiple targets, and the visualization of sky coverage for 
all ESA missions. The end result is a “Google Earth for space”, ef- 
fectively combining the vast collection of data hosted by the ESA 
and providing an annotated map of the Universe that facilitates data 
querying and exploration across multiple data sources.


LSSGalPy [AFPR*17] emphasizes the exploration of the large-scale
structures surrounding galaxies and visualizes isolated galaxies,
isolated pairs, and isolated triplets in relation to other galaxies
within their large-scale structures. The paper describes one use case 
that investigates the effect of local and large-scale environments on 
nuclear activity and star formation, and another use case that vi- 
sualizes galaxies with kinematically decoupled stellar and gaseous 
components, including an estimation of the tidal strength that af- 
fects each galaxy. 


The Cosmicflows project aims to reconstruct and map the structure
of the local universe, providing a series of catalogs that measure
galaxy distances and velocities [TCD*13]. Supporting this
project, Pomarede et al. [PCHT17] provided four “cosmography” 
use cases for the SDvision visualization software, focusing on the 
creation of animations and interactive 2D and 3D visualizations of 
scalar and vector fields found in catalogs of galaxies, mapping cos- 
mic flows, representing basins of attraction, and viewing the Cos- 
mic V-web [PHCT17]. Pomarede et al. also explored the use of 
Sketchfab, a web-based interface that enables the uploading and 
sharing of 3D models that can be viewed in virtual reality. 


The vast scales present in astronomical datasets can be difficult
to render and present simultaneously. Klashed et al. [KHE*10]
introduced the “ScaleGraph” concept to deal with imprecision in
rendering in the Uniview software. Fu and Hanson [FH07] utilized
power-scaled coordinates to cover the distance ranges. More
recently, Axelsson et al. [ACS*17] presented the dynamic scene
graph, which enables fast and accurate scaling, positioning, and
navigation without a significant loss of precision. At the core of
this technique is the dynamic reassignment of the camera to focus
on the object of interest, which then becomes the origin of the new
coordinate system, ensuring the highest possible precision.
Axelsson et al. applied this technique in the open-source software
OpenSpace.
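
To illustrate why making the object of interest the coordinate
origin preserves precision, the following minimal Python sketch
emulates 32-bit floating-point storage, as is common on graphics
hardware. The 1D scene, the distances, and the helper names
(to_f32, camera_offset_f32) are illustrative assumptions and are
not taken from OpenSpace or from Axelsson et al.

```python
import struct

def to_f32(x: float) -> float:
    """Round a 64-bit Python float to the nearest 32-bit float."""
    return struct.unpack("f", struct.pack("f", x))[0]

def camera_offset_f32(camera_pos: float, object_pos: float, origin: float) -> float:
    """Object-to-camera offset after both positions are stored in
    float32 relative to the chosen origin."""
    cam = to_f32(camera_pos - origin)
    obj = to_f32(object_pos - origin)
    return to_f32(cam - obj)

# Hypothetical 1D scene: a spacecraft about 1.5e11 m (roughly 1 au) from the
# Sun, with the camera 2 m away from the spacecraft.
spacecraft = 1.5e11
camera = spacecraft + 2.0

# Fixed origin at the Sun: 2 m is below float32 resolution at this distance,
# so camera and spacecraft collapse onto the same stored position.
fixed = camera_offset_f32(camera, spacecraft, origin=0.0)

# Origin reassigned to the object of interest: the small offset survives.
rebased = camera_offset_f32(camera, spacecraft, origin=spacecraft)

print(fixed, rebased)
```

With the fixed origin the two positions become indistinguishable,
whereas after re-centering the 2 m separation is represented
exactly; this is the effect that the dynamic reassignment of the
origin is meant to exploit.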


OpenSpace [BAC*20] is a software system that enables the
interactive exploration of a multitude of available astronomy
datasets (Fig. 4). It is designed to be robust enough to support
educational and outreach activities as well as adaptable enough to
allow for the incorporation of new data or analysis tools to
support scientific research. For the first task, OpenSpace has
already demonstrated success in science communication at museums
and in planetariums. For the second task, OpenSpace’s ability to
interface with tools such as Glue [GBR18] or Aladin exemplifies a
growing paradigm in astronomy visualization: the combination of
multiple available tools to complete a task rather than building a
bespoke system from the ground up. OpenSpace exhibits several novel
features, including multi-resolution globe browsing [BAB*18], which
enables dynamic loading of high-resolution planetary surface
textures, and physically based rendering of planetary atmospheres
[CBE*21].


Figure 4: OpenSpace: time-varying coronal mass ejection simulation
with 3D rendering and field lines. Image reproduced from Bock
et al. [BAC*20].


Gaia Sky [SJMS19] is an open-source, 3D universe explorer that
enables users to navigate the stars of our galaxy from the Gaia
catalog (Gaia Data Release 2). It also aids in the production of
outreach material. The system embeds stars in a multi-scale octree
structure in which each level holds stars of different absolute
brightness. It provides a floating camera for space traversal,
integrated visualization of relativistic effects, real-time star
movement, and simulation of the visual effects of gravitational
waves. The main strength of Gaia Sky is its capability to provide
real-time interactive exploration of hundreds of millions of stars.
Its efficient handling of the data allows it to manage a large
range of scales with sufficient numerical precision.


Vohl et al. [VBF*16] presented Encube to accelerate the visual
discovery and analysis process of large data cubes in medical imag- 
ing and astronomy (Fig. 5). Encube can be used on a single desktop 
as well as the CAVE2 immersive virtual reality display environ- 
ment. In the CAVE2 environment, Encube enables users to control 
and interact with a visualization of over 100 data cubes across 80 
screens. The design focuses on comparative visualization and re- 
lated user interactions, such as swapping screens and requesting 
quantitative information from the selected screens. It uses a dis- 
tributed model to seamlessly process and render visualization and 
analysis tasks on multiple data cubes simultaneously. Additionally, 
Encube serializes the workflow and stores the data in the JSON for- 
mat, so that the discovery process can be reviewed and re-examined 
later. A desktop version of Encube supports many of the same
functionalities as the CAVE2 version. Because the discovery process
is recorded, researchers can continue their workflow when they
return to their desktops.


Recognizing that FITS images were inherently complex, and that
existing FITS viewers were not built with an optimal user
experience in mind, Muna [Mun17] introduced Nightlight, an “easy to
use, general purpose, high-quality” viewer. Nightlight uses
detail-on-demand to provide a high-level view of the file structure
upon loading, and allows quick exploration of the data. Instead of
reducing the dynamic range of astronomical data while visualizing
FITS images, Nightlight builds its approach on the fact that the
input image is likely astronomical data. It provides two scaling
modes for astronomers: hyperbolic sine function scaling for bright
features (e.g., stars), and linear scaling for faint features
(e.g., nebulae). For FITS tables, Nightlight provides two views.
The first is a grid of “cards”, where each card represents the
metadata of a single column in the table. The “cards” view is
complemented by a second view in which the user can find the
details of the full table.
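
The difference between the two scaling modes can be sketched as
follows. This is not Nightlight's implementation: we assume that the
hyperbolic-sine-based mode behaves like the inverse hyperbolic sine
(asinh) stretch that is widely used for astronomical images, and
the function names and the softening parameter are hypothetical.

```python
import math

def linear_stretch(value, vmin, vmax):
    """Map value linearly onto [0, 1], clipping outside [vmin, vmax]."""
    t = (value - vmin) / (vmax - vmin)
    return min(1.0, max(0.0, t))

def asinh_stretch(value, vmin, vmax, softening=0.01):
    """Inverse-hyperbolic-sine stretch: close to linear for faint pixels
    and logarithmic (compressed) for bright ones. `softening` is a
    hypothetical parameter that sets where the transition happens."""
    t = linear_stretch(value, vmin, vmax)
    return math.asinh(t / softening) / math.asinh(1.0 / softening)

# Normalized pixel values spanning two orders of magnitude.
pixels = [0.0, 0.01, 0.1, 1.0]
linear = [linear_stretch(p, 0.0, 1.0) for p in pixels]
stretched = [asinh_stretch(p, 0.0, 1.0) for p in pixels]

for p, a, b in zip(pixels, linear, stretched):
    print(f"pixel={p:5.2f}  linear={a:5.3f}  asinh={b:5.3f}")
```

Under this assumption, the nonlinear mode spends far less of the
display range on the brightest pixels than the linear mode does,
which is why a single image containing both stars and faint
extended emission benefits from a choice between the two.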


Since its introduction, TOPCAT [Tay05] has been widely used to
view, analyze, and edit tabular data in the astronomy community. In
addition to generic tasks such as sorting rows, computing
statistics of columns, and cross-matching between tables, TOPCAT
also provides astronomy-specific functionalities, including access
to Virtual Observatory data, handling of various coordinate
systems, and joining tables based on sky positions [Tay17a]. Over
the past decade, the developers of TOPCAT have continued to improve
its capabilities. Taylor [Tay14] described a rewrite of the
plotting library added to TOPCAT v4, which is designed to improve
the responsiveness and performance of the visualization of large
datasets. One important new feature is the hybrid scatter
plot/density map (see Fig. 6), which enables users to navigate
interactively between the high- and low-density regions without
changing plot types.
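
The idea behind a hybrid scatter plot/density map can be sketched in
a few lines of Python. The sketch below is only one plausible way to
realize it and is not TOPCAT's algorithm: the grid-based binning,
the function name hybrid_plot_layers, and the threshold
max_points_per_cell are all assumptions made for illustration.

```python
from collections import defaultdict

def hybrid_plot_layers(points, cell_size, max_points_per_cell=3):
    """Split 2D points into a scatter layer (sparse cells, drawn as
    individual markers) and a density layer (crowded cells, drawn as
    a count per cell)."""
    cells = defaultdict(list)
    for x, y in points:
        key = (int(x // cell_size), int(y // cell_size))
        cells[key].append((x, y))

    scatter, density = [], {}
    for key, members in cells.items():
        if len(members) <= max_points_per_cell:
            scatter.extend(members)      # few points: keep them individually
        else:
            density[key] = len(members)  # many points: keep only the count
    return scatter, density

# Five points crowd one cell; a single outlier sits far away.
points = [(0.1, 0.1), (0.2, 0.3), (0.4, 0.4), (0.5, 0.6), (0.7, 0.2),
          (5.5, 5.5)]
scatter, density = hybrid_plot_layers(points, cell_size=1.0)
print(scatter, density)
```

In this sketch the outlier remains visible as an individual marker
while the crowded cell is summarized by its count, so both regimes
can be shown in one plot.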


Taylor [Tay17a] described the exploratory visualization capabil- 
ities of TOPCAT, which include high-dimensional plotting, high- 
density plotting, subset selection, row highlighting, linked views, 
and responsive visual feedback. Apart from the GUI application, 
users can also access TOPCAT from a set of command-line tools. 


4.2. Hierarchical Data Exploration 


Figure 5: Comparative visualization of 20 galaxy morphologies
with Encube [VBF*16]. Image reproduced from “Large-scale
comparative visualization of sets of multidimensional data”,
written by Dany Vohl, David G. Barnes, Christopher J. Fluke,
Govinda Poudel, Nellie Georgiou-Karistianis, Amr H. Hassan, Yuri
Benovitski, Tsz Ho Wong, Owen L. Kaluza, Toan D. Nguyen, and C. Paul
Bonnington, published in the PeerJ Computer Science journal. Link
to article: https://peerj.com/articles/cs-88/.


Scherzinger et al. [SBD*17] proposed a unified visualization tool
based on Voreen [MSRMH09] that supports the interactive
exploration of multiple data layers contained within dark matter
simulations. These simulations contain only dark matter particles,
in contrast to simulations that also include gas and stars. Their
visualization enables users to view the global structure of the
data through 2D and
3D volume rendering and particle rendering, and the time-varying 
properties of the data through a merger tree visualization. Local 
structures are explored further through local particles visualization 
and the integration with Galacticus, an open-source semi-analytic 
model that computes information about galaxy formation based on 
merger tree hierarchies of dark matter halos [Ben12]. An important 
aspect of their approach is scalable volume rendering, where the 
distribution of dark matter is visualized at interactive frame rates 
based on a pre-processing conversion. During such a conversion, 
attributes of large-scale particle data are distributed over a voxel 
grid, and maximum intensity projection in the 3D view is computed 
to highlight high-density regions of the data for volume rendering. 
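
The pre-processing conversion described above can be sketched as
follows. This is a simplified stand-in rather than the authors'
implementation: we assume a nearest-grid-point deposit of particle
mass onto the voxel grid and a projection along the z axis, and the
function names are hypothetical.

```python
def deposit_on_grid(particles, n, box_size):
    """Nearest-grid-point deposit of (x, y, z, mass) particles onto an
    n x n x n voxel grid covering [0, box_size) in each dimension."""
    grid = [[[0.0] * n for _ in range(n)] for _ in range(n)]
    scale = n / box_size
    for x, y, z, mass in particles:
        i = min(n - 1, int(x * scale))
        j = min(n - 1, int(y * scale))
        k = min(n - 1, int(z * scale))
        grid[i][j][k] += mass
    return grid

def max_intensity_projection(grid):
    """Project along z: each (x, y) pixel keeps the largest voxel value
    found along its line of sight."""
    return [[max(column) for column in plane] for plane in grid]

# Four particles in a unit box, deposited on a 2 x 2 x 2 grid.
particles = [(0.1, 0.1, 0.1, 1.0), (0.1, 0.1, 0.2, 2.0),
             (0.1, 0.1, 0.9, 1.5), (0.9, 0.9, 0.5, 4.0)]
grid = deposit_on_grid(particles, n=2, box_size=1.0)
mip = max_intensity_projection(grid)
print(mip)
```

Because the voxel grid has a fixed size regardless of the number of
particles, the expensive pass over the particle data happens once,
and the projection can then be recomputed at interactive rates.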


Other tools also focus on exploring the evolution of galaxy halos
within simulation datasets. Hazarika et al. [HWMB15] presented a
series of visualizations to provide insight into halos, including a
3D volume rendering of simulation data and a particle rendering
that identifies halo sub-structures. Almryde and Forbes [AF15]
introduced an interactive web application to create animated
“traces” of dark matter halos as they move in relation to each
other over time, and Hanula et al. [HPU*15] presented the Cavern
Halos project, which enables the exploration of halos in virtual
reality using the CAVE2 immersive collaboration space (this project
was later extended and renamed DarkSky Halos [HPAM19]). See also
the discussion of work by Preston et al. [PGX*16] in Sect. 5.


In order to better investigate the nature of solar wind ion data
(SWID), which is typically visualized using 1D and 2D methods,
Zhang et al. [ZST11] developed a 3D visualization method for SWID
based on the Selenocentric Solar Ecliptic coordinate system, and
integrated this method into an interactive tool called vtSWIDs.
vtSWIDs enables researchers to browse through numerous records and
provides statistical analysis capabilities.


Figure 6: TOPCAT: Hybrid scatter plot/density map [Tay17a].
Image reproduced from “TOPCAT: Desktop Exploration of Tabular
Data for Astronomy and Beyond”, written by Mark Taylor, and
published in the Informatics journal. Link to article:
https://doi.org/10.3390/informatics4030018.


Breddels et al. [BV18] introduced Vaex, a Python library that 
handles large tabular datasets such as the Gaia catalogue. Many 
packages in Vaex are developed with specific visualization chal- 
lenges in mind, and they overcome the scalability issues with meth- 
ods such as efficient binning of the data, lazy expressions, and just- 
in-time compilation. For example, vaex-core provides visualization 
using the matplotlib library, with 1D histograms and 2D density 
plots; vaex-jupyter embeds the visualization tools in a web browser, 
which offers more user interactions such as zooming, panning, and 
on-plot selection. It also enables 3D volume and iso-surface ren- 
dering using ipyvolume and connecting to a remote server using 
WebGL. A standalone interface is provided by the vaex-ui package, 
which supports interactive visualization and analysis. The
vaex-astro package is specifically designed for astronomical data,
supporting the FITS format and the most common coordinate
transformations needed for the analysis of astronomical data.
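
To make the combination of lazy expressions and efficient binning
more concrete, the following pure-Python sketch streams over a table
in chunks and histograms a derived quantity without ever storing it
as a column. It deliberately does not use the Vaex API; the
functions binned_count and read_chunks, and the tiny example table,
are hypothetical stand-ins.

```python
import math

def binned_count(chunks, expression, lo, hi, n_bins):
    """Histogram a lazily evaluated expression over a stream of chunks."""
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    for chunk in chunks:                 # only one chunk is held at a time
        for row in chunk:
            value = expression(row)      # computed on the fly, never stored
            if lo <= value < hi:
                index = min(n_bins - 1, int((value - lo) / width))
                counts[index] += 1
    return counts

def read_chunks(rows, chunk_size):
    """Stand-in for reading a large table from disk in pieces."""
    for start in range(0, len(rows), chunk_size):
        yield rows[start:start + chunk_size]

rows = [{"x": 3.0, "y": 4.0}, {"x": 0.0, "y": 1.0},
        {"x": 6.0, "y": 8.0}, {"x": 0.0, "y": 0.0}]

# A "virtual column": the radius is defined once and evaluated per row.
radius = lambda row: math.hypot(row["x"], row["y"])

histogram = binned_count(read_chunks(rows, 2), radius, 0.0, 10.0, 5)
print(histogram)
```

The memory footprint of such an approach depends on the number of
bins rather than on the number of rows, which is the property that
makes binned statistics attractive for catalogs of this size.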


The work by Yu et al. [YEII12] was motivated by the need for an
enhanced spatial selection mechanism using direct-touch input for
astronomical particle data, such as numerical simulations of the
gravitational processes of stars or galaxies. They introduced two
new techniques, TeddySelection and CloudLasso, to support
efficient, interactive spatial selection in large 3D particle
datasets. Their selection techniques automatically identify
bounding selection surfaces surrounding the selected particles
based on the density, thus reducing the need for the complex
Boolean operations that are part of traditional multi-step
selection processes. They applied their techniques to particle
datasets from a galaxy collision simulation
(http://www.galaxydynamics.org) and an N-body mass simulation from
the Aquarius Project [SWV*08]. In a follow-up work [YEII16], Yu
et al. further enhanced their 3D selection techniques to aid the
exploratory analysis of astronomical data. They proposed a
collection of context-aware selection techniques (CAST) that
improve the usability and speed of spatial selection, and applied
their methods to a cosmological N-body simulation and the
Millennium-II dataset [SWJ*05].


The 2019 SciVis contest proposed a visual analysis challenge to
explore structure formation during cosmic evolution. The dataset
was from a CRK-HACC (HACC: Hardware/Hybrid Accelerated Cosmology
Code) cosmological simulation containing dark matter plus baryon
particles in a cubic box, where the particles carry multiple fields
such as position, velocity, and temperature. The simulations were
used to study the impact that feedback from AGN (Active Galactic
Nuclei) has on the surrounding matter distribution. The entries
from the contest (e.g., [FRG19, HSS*19, NNPD19, SMG*19])
represented a diverse collection of visualizations, made possible
by these new forms of simulation datasets.


5. Feature Identification 


Research efforts in this category visually guide the identification 
and extraction of features of interest. The term “feature” is broad 
and can be used in a number of different astrophysical contexts. 
The detection of features in an astrophysical datastream is of
critical importance since many interesting phenomena are diffuse or
observed with a low signal-to-noise ratio. For example, physical
phenomena may be subtle to detect (or may be detected for the first
time), and distinguishing between what is signal and what is noise
is critical. Teasing out a tiny signal is so common in astronomy
that feature detection is a generically important element of
astrophysical progress. Furthermore, astrophysicists are often
looking for diffuse physical contrasts in multiple dimensions
(e.g., spatial, chemical, magnetic, density). For these phenomena,
methods that
establish robust criteria in multiple dimensions for identification 
and subsequent analysis are crucial. The majority of these papers 
focus on dark matter simulations and the cosmic web, in particular 
voids, filaments, and dark matter halos, as summarized in Table 3. 
The cosmic web refers to the large-scale structure of matter, dis- 
tributed in filaments, the gravitationally collapsed structures that 
tend to connect galaxy halos, and voids, the low-density areas of 
the Universe. 


Visualizing dark matter simulations and cosmic web. Papers in
this subsection employ various techniques to visualize dark matter
simulations and the cosmic web, including GPU-assisted rendering
with a tailored tessellation mesh [KHA12], tomographic maps
[RAW*20], and interactive visual exploration of cosmic objects
[PGX*16, SXL*14].


Dark matter generates small-scale density fluctuations and plays 
a key role in the formation of structures in the Universe. Kaehler 
et al. [KHA12] visualized N-body particle dark matter simulation 
data using GPU-assisted rendering approaches. Their method lever- 
ages the phase-space information of an ensemble of dark matter 
tracer particles to build a tetrahedral decomposition of the compu- 
tational domain that allows a physically accurate estimation of the 
mass density between the particles [KHA12]. During the simulation,
vertices of a tessellation mesh are defined by the dark matter
particles in an N-body simulation, whereas tetrahedral cells contain
equal amounts of mass. The connectivity within the mesh is gener- 
ated once and is kept constant over the simulation as the cells warp 
and overlap. The density of a given location in the simulation is 
obtained by considering the density contribution from overlapping 
cells in the region of interest. Their new approaches are shown to be 
effective in revealing the structure of the cosmic web, in particular, 
voids, filaments, and dark matter halos. 
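
The density estimate described above can be sketched as follows: each
tetrahedral cell carries the same mass, so its density is that mass
divided by its current volume, and the density at a location is the
sum over all cells that cover it. This CPU sketch is only meant to
illustrate the principle and is unrelated to the GPU implementation
of Kaehler et al.; the function names and the two example cells are
assumptions.

```python
def tetra_volume(a, b, c, d):
    """Volume of a tetrahedron: |det[b - a, c - a, d - a]| / 6."""
    u = [b[i] - a[i] for i in range(3)]
    v = [c[i] - a[i] for i in range(3)]
    w = [d[i] - a[i] for i in range(3)]
    det = (u[0] * (v[1] * w[2] - v[2] * w[1])
           - u[1] * (v[0] * w[2] - v[2] * w[0])
           + u[2] * (v[0] * w[1] - v[1] * w[0]))
    return abs(det) / 6.0

def contains(cell, p, tol=1e-9):
    """A point lies inside a tetrahedron if the four sub-tetrahedra it
    forms with the faces add up to the volume of the cell."""
    a, b, c, d = cell
    total = tetra_volume(a, b, c, d)
    parts = (tetra_volume(p, b, c, d) + tetra_volume(a, p, c, d)
             + tetra_volume(a, b, p, d) + tetra_volume(a, b, c, p))
    return abs(parts - total) <= tol * max(total, 1.0)

def density_at(p, cells, cell_mass):
    """Sum the contributions of all (possibly overlapping) cells at p."""
    return sum(cell_mass / tetra_volume(*cell)
               for cell in cells if contains(cell, p))

# Two overlapping cells of equal mass: a compressed one and a stretched one.
small = ((0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1))
large = ((0, 0, 0), (2, 0, 0), (0, 2, 0), (0, 0, 2))
cells = [small, large]

print(density_at((0.1, 0.1, 0.1), cells, cell_mass=1.0))  # inside both cells
print(density_at((0.6, 0.6, 0.6), cells, cell_mass=1.0))  # inside `large` only
```

Since the mass per cell is fixed, a cell that has been compressed by
gravitational collapse contributes a high density, and regions where
many warped cells overlap naturally stand out.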


The Lyα forest, which is a series of individual over-densities of
neutral hydrogen within the intergalactic medium (IGM, the space
between galaxies), provides a 1D measurement of information in the
IGM, which is largely correlated with the distribution of matter in
the Universe. Ravoux et al. [RAW*20] used a tomographic
reconstruction algorithm called Wiener filtering to create a 3D
tomographic map with the eBOSS Stripe 82 Lyα forest datasets. The
map is used as a representation of the associated matter
fluctuation to identify over- and under-densities in the cosmic
web. Extended over-densities can be detected with the tomographic
map by searching for large deficits in the Lyα forest flux
contrast. The authors adopt a simple-spherical algorithm to
identify large voids. In order to further investigate the regions
of interest, the paper provides 3D representations of the
tomographic map over the entire stripe. Users can interactively
explore the map through rotating, panning, and zooming.


Gravity causes dark matter particles to collapse into larger struc- 
tures over time. The individual groups of particles formed dur- 
ing this process are called halos, one of the most common ele- 
ments in the dark matter simulation [PGX*16]. Their evolution 
process and behaviors are often the focus of astronomical discov- 
eries. Two recent tools facilitate the visual exploration of halos. 
Shan et al. [SXL*14] built an interactive visual analysis system
that focuses on exploring the evolutionary histories of halos. The
interface allows the user to manually select regions of interest in 
3D space. It then uses the marching cubes algorithm to perform 
iso-surface extraction and cluster separation based on the region’s 
density distribution. To prevent overlaps in the 3D space, the sys- 
tem employs multi-dimensional scaling (MDS) to project the halos 
into 2D space. Multiple linked views are generated to support the 
exploration through time. In addition to a merger tree view that 
is commonly used to visualize evolution of objects over time, Shan 
et al. proposed a unique particle trace path graph (see Fig. 7), which 
encodes the evolution history of selected particles. 


Preston et al. [PGX*16], on the other hand, aimed to increase the
efficiency and interactivity of studying the evolution of halos, as
described by merger trees. Their integrated visualization system
consists of a merger tree view, a 3D rendering view, and a
quantitative analysis view. Their merger tree view enhances that of
[SXL*14] with more interactive capabilities. The system allows
users to select specific halos through the merger tree and organize
the tree based on other physical variables such as velocity
and mass. The 3D rendering view displays the particles’ physical 
behaviors over a number of time steps, providing additional con- 
textual information for the merger tree. A remote parallel renderer 
is employed to improve the scalability of the rendering process. Fi- 
nally, the quantitative analysis view extends the other two views 
by providing quantitative information of selected particles that re- 
veals additional insights into the behavior of the halo. For instance, 
a chronological plot visualizes the anomalous events automatically 
detected in the history of a halo. An important feature of the sys- 
tem is that it enables simultaneous exploration of heterogeneous 
cosmology data; see Sect. 4 for further discussions. 
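
For readers unfamiliar with merger trees, the following sketch shows
a minimal tree data structure together with two queries that a merger
tree view typically needs: following the most massive progenitor back
in time, and listing merger events. The Halo class, its fields, and
the example tree are hypothetical and are not taken from the systems
discussed above.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Halo:
    name: str
    mass: float
    snapshot: int                       # time step at which the halo exists
    progenitors: List["Halo"] = field(default_factory=list)

def main_branch(halo):
    """Follow the most massive progenitor back in time from `halo`."""
    branch = [halo]
    while halo.progenitors:
        halo = max(halo.progenitors, key=lambda h: h.mass)
        branch.append(halo)
    return branch

def merger_events(halo):
    """Return (snapshot, progenitor names) wherever two or more halos merge."""
    events, stack = [], [halo]
    while stack:
        node = stack.pop()
        if len(node.progenitors) >= 2:
            names = sorted(p.name for p in node.progenitors)
            events.append((node.snapshot, names))
        stack.extend(node.progenitors)
    return sorted(events)

# Three halos at snapshot 0 merge into a single halo by snapshot 2.
a0, b0, c0 = Halo("a0", 1.0, 0), Halo("b0", 0.5, 0), Halo("c0", 2.0, 0)
a1 = Halo("a1", 1.6, 1, [a0, b0])
c1 = Halo("c1", 2.1, 1, [c0])
root = Halo("root", 3.9, 2, [a1, c1])

print([h.name for h in main_branch(root)])
print(merger_events(root))
```

Organizing the tree by another physical variable, as described
above, would amount to replacing the mass-based key in such queries
with the variable of interest.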


The IllustrisTNG project (https://www.tng-project.org/) contains
collections of large, cosmological magnetohydrodynamical
simulations of galaxy formation. It is designed to “illuminate the
physical processes that drive galaxy formation”. The tool provides
a number of volume rendering capabilities to visually demonstrate
the multi-scale, multi-physics nature of the simulations, as well
as to perform qualitative inspections [PSD17].


Figure 7: An example of a particle trace path. Image reproduced
from Shan et al. [SXL*14].


Moving from clusters of galaxies to the spaces between them, the
IGM is composed of gas complexes in the spaces between galaxies.
Although the IGM has research value on its own, investigating it
along quasar sightlines helps put it in context. A quasar is a
supermassive black hole at the center of a galaxy that is accreting
gas at a high rate and is therefore very bright. Its light enables
scientists to associate certain absorption features with the
galactic environment, such as the circumgalactic medium (CGM),
which is the gaseous envelope surrounding a galaxy. IGM-Vis
[BAO*19] is a visualization software specifically designed to
investigate IGM/CGM data.
It supports a number of identification, analysis, and presentation 
tasks with four linked views. The Universe panel provides a 3D 
interactive plot of galaxies in circles and quasar sightlines in cylin- 
drical “skewers”. The user can select a galaxy of interest to further 
examine it in the galaxy panel, which contains a list of attributes 
and corresponding data from SDSS. Additionally, quasar sightlines 
can be explored in the spectrum panel where multiple spectral plots 
can be displayed and stored. The final equivalent width plot panel 
facilitates dynamic correlation analysis and helps users discover ab- 
sorption patterns in the regions of interest. The four views comple- 
ment each other to streamline the discovery processes, including 
the identification of foreground and sightline features, the
measurement of absorption properties, and the detection of
absorption patterns.
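
The equivalent width shown in the last panel is a standard measure of
the strength of an absorption feature: the integral of
1 - F/F_c over wavelength, where F is the observed flux and F_c the
continuum. The following sketch computes it with the trapezoidal
rule; it is not taken from IGM-Vis, and the function name and the
toy spectrum are assumptions.

```python
def equivalent_width(wavelengths, flux, continuum):
    """Integrate the fractional absorption depth (1 - F / F_c) over
    wavelength with the trapezoidal rule. The result has the units of
    `wavelengths`."""
    depth = [1.0 - f / c for f, c in zip(flux, continuum)]
    width = 0.0
    for i in range(1, len(wavelengths)):
        step = wavelengths[i] - wavelengths[i - 1]
        width += 0.5 * (depth[i] + depth[i - 1]) * step
    return width

# Toy triangular absorption line on a flat, normalized continuum.
wavelengths = [1215.0, 1215.5, 1216.0, 1216.5, 1217.0]   # Angstrom
flux = [1.0, 0.5, 0.0, 0.5, 1.0]
continuum = [1.0] * len(flux)

width = equivalent_width(wavelengths, flux, continuum)
print(width)
```

Comparing such values across sightlines at different distances from
a galaxy is one way in which absorption patterns of the kind
described above can be revealed.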


Blazars, which like quasars are active galactic nuclei but have
relativistic jets directed toward the Earth, are among the most
attractive objects for astronomers to observe. The TimeTubes
visualization [XNW*16] transforms time-varying blazar data and
polarization parameters into a series of ellipses arranged along a
time line, forming a volumetric tube in 3D space. The most recent
iteration of the project, TimeTubesX [SUB*20], includes feature
identification techniques to detect recurring time variation
patterns in blazar datasets. It includes an automatic feature
extraction functionality to identify time intervals that correspond
to well-known blazar behaviors, as well as dynamic visual
query-by-example and query-by-sketch functionality. The latter
enables users to search long-term observations for intervals that
are similar to a selected time interval of interest or that match a
sketch of a temporal pattern. The technique aims to enhance the
reliability of blazar observations, and to identify flares,
rotations, and other recurring blazar patterns in order to validate
hypotheses about observable, photometric, and polarimetric
behaviors.
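
Query-by-example can be sketched as a sliding-window similarity
search. The version below is a deliberately simplified, univariate
illustration and not the TimeTubesX algorithm: we assume
z-normalized windows compared by Euclidean distance, and the
function names and the toy light curve are hypothetical.

```python
import math

def znorm(window):
    """Shift and scale a window to zero mean and unit variance so that
    matches depend on shape rather than on absolute brightness."""
    mean = sum(window) / len(window)
    std = math.sqrt(sum((v - mean) ** 2 for v in window) / len(window))
    return [(v - mean) / std if std > 0 else 0.0 for v in window]

def query_by_example(series, start, length, top_k=1):
    """Rank the windows of `series` by their distance to the selected
    interval [start, start + length) and return the best start indices."""
    query = znorm(series[start:start + length])
    hits = []
    for i in range(len(series) - length + 1):
        if abs(i - start) < length:      # skip windows overlapping the query
            continue
        candidate = znorm(series[i:i + length])
        dist = math.sqrt(sum((q - c) ** 2 for q, c in zip(query, candidate)))
        hits.append((dist, i))
    return [i for dist, i in sorted(hits)[:top_k]]

# Toy light curve with a small flare at index 2 and a larger one at index 9.
flux = [0.0, 0.0, 1.0, 3.0, 1.0, 0.0, 0.0, 0.0, 0.0, 2.0, 6.0, 2.0, 0.0, 0.0]
matches = query_by_example(flux, start=2, length=3)
print(matches)
```

Selecting the first flare as the example retrieves the second one
even though it is twice as bright, because the normalization
compares only the shape of the variation. Query-by-sketch can reuse
the same search with a user-drawn curve in place of the selected
interval.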


To study the agreements and disparities of feature identifica- 
tion methods created for classifying the cosmic web, Libeskind 
et al. [LvdWM17] collected 12 representative methods and applied 
them to the same GADGET-2 dark matter simulation. They clas- 
sified the dark matter density field of the cosmic web into knots, 
filaments, walls, and voids. They used comparative visualization 
accompanied by a variety of 2D plots to provide intuitive
representations of the different structures identified by these methods.
We introduce one of the topology-based methods with a strong vi- 
sualization component in the next subsection. 


Topology-based feature extraction. There are several examples of 
using topological techniques to extract cosmological features from 
simulations, in particular, galaxy filaments, voids, and halos.
Topological methods have also been applied to observational data cubes.
We believe that the integration of topological techniques in astro- 
nomical feature extraction and visualization will be a growing area 
of interest (see Sect. 8). 


Sousbie [Sou11] presented DisPerSE, a topology-based formalism
that is designed to analyze the cosmic web and its filamentary
structure. It leverages discrete Morse theory and computes a
Morse-Smale complex (MSC) on a density field. The MSC is then 
simplified using the theory of persistent homology by canceling the 
topological features with low persistence values (i.e., those that are 
likely generated by noise). The relationship between the topologi- 
cal and geometrical features is easily detectable in the MSC, where 
the ascending 3-manifolds correspond to the voids, ascending 2- 
manifolds to the walls, and ascending 1-manifolds to the filaments. 
The technique is scale-free, parameter-free, and robust to noise. 
Sousbie et al. then demonstrated the effectiveness of DisPerSE at 
tracing cosmological features in 2D and 3D datasets [SPK11]. 
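
Persistence-based simplification can be illustrated on a 1D sampled
field. The sketch below is a 1D analogue only and is not DisPerSE,
which operates on Morse-Smale complexes in 2D and 3D: it pairs each
local minimum with the value at which its basin merges into a deeper
one and treats minima with low persistence as noise. The function
name, the example values, and the threshold of 0.5 are assumptions.

```python
def minima_persistence(values):
    """Return (index_of_minimum, persistence) pairs for a 1D sampled
    function, sweeping the samples from the lowest to the highest value."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    parent, birth = {}, {}     # union-find forest; root -> index of its minimum

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    pairs = []
    for i in order:
        parent[i], birth[i] = i, i
        for j in (i - 1, i + 1):
            if j not in parent:
                continue
            ri, rj = find(i), find(j)
            if ri == rj:
                continue
            # The basin with the higher minimum is the younger one and
            # dies here; the sample just added always counts as youngest.
            old, young = sorted(
                (ri, rj), key=lambda r: (values[birth[r]], birth[r] == i))
            if birth[young] != i:
                pairs.append((birth[young], values[i] - values[birth[young]]))
            parent[young] = old
    # The basin of the global minimum never merges into a deeper one.
    pairs.append((birth[find(order[0])], float("inf")))
    return pairs

# A deep minimum (index 5), a clear one (index 1), and a shallow dip (index 3).
density = [3.0, 1.0, 2.0, 1.8, 4.0, 0.0, 5.0]
pairs = minima_persistence(density)
significant = sorted(i for i, p in pairs if p >= 0.5)
print(pairs, significant)
```

The shallow dip at index 3 has a persistence of about 0.2 and is
cancelled, while the two pronounced minima survive. In the setting
described above, the same principle is applied to critical points of
a density field in order to remove features that are likely
generated by noise.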


Following a similar path, Shivashankar et al. [SPN*16] proposed
Felix, another topology-based framework that identifies cosmolog- 
ical features (see Fig. 8). Felix focuses on extracting the filamen- 
tary structures and incorporates a visual exploration component. It 
also computes an MSC over a density field and simplifies it by
iteratively canceling pairs of simplices, which generates a hierarchy
of MSCs. Realizing that it is nearly impossible to find a version 
of the MSC within the hierarchy that best separates noise and fea- 
tures for cosmology datasets, Felix allows users to query for specific 
density ranges across all generated MSCs. This process increases 
user engagement in the parameter selection process and helps pre- 
serve filament structures within void-like or cluster-like regions. 
Felix also utilizes 3D volume rendering to interactively guide the 
selection of parameters for the query and visualizes the extracted 
filaments along with the density field. Interactive visual exploration 
of these intricate features remains a challenging and largely unex- 
plored problem. 


Recently, a new method has been proposed by Tricoche 
et al. [TSH21] to extract the topology of the Poincaré map in 
the circular restricted three-body problem (CR3BP). They created 
an interactive visualization of the topological skeleton to support 
spacecraft trajectory designers in their search for energy-efficient 
paths through the interconnected web of periodic orbits between 
celestial bodies. The new method extends the existing approach by 
Schlei et al. [SHTG14], and significantly improves the results of 
fixed point extraction and separatrices construction. In order to re- 
duce the high computational cost, Tricoche et al. pre-screened for 
impractical spaceflight structures, and leveraged previous knowl- 
edge on the accuracy limitations of sensors and engines to impose 
restrictions on certain parameters. These adjustments reduce the 
computational workload of the method and enable the interactive 
visualization of the topology. The visualization displays the fixed 
points identified by the system and each individual selected orbit 
as a closed curve. The visualization also enables a manifold arc se- 
lection mechanism to help the trajectory designer to determine the 
precise path a spacecraft would need to follow from any arbitrary 
location. 


From an observational perspective, current radio and millimeter
telescopes, particularly ALMA, are producing data cubes with
significantly increased sensitivity, resolution, and spectral
bandwidth. However, these advances often lead to the detection of
structure with increased spatial and spectral complexity. Rosen
et al. [RSM*19] performed a feasibility study for applying
topological techniques, in particular contour trees, to extract and
simplify the complex signals from noisy ALMA data cubes. They
demonstrated the topological de-noising capabilities on a data cube
of NGC 404 (also known as Mirach’s Ghost) and a CMZ (Central
Molecular Zone) data cube. Using topological techniques, Rosen
et al. sought to improve upon existing analysis and visualization
workflows of ALMA data cubes, in terms of accuracy and speed in
feature extraction.


Feature extraction from astronomy data cubes. In addition to the
work by Rosen et al. [RSM*19], other visualizations of integral
field spectrometer (IFS) data cubes have been proposed. Campbell
et al. [CKA12] presented a 3D interactive visualization tool
specifically designed to render IFS data cubes. A typical display
tool reduces a 3D IFS data cube to 2D images of either the spatial
or the wavelength dimension. Campbell et al. proposed to use volume
rendering instead to highlight features and characteristics of
astronomical objects that are difficult to detect in
lower-dimensional projections. The tool, known as OsrsVol, allows
users to easily manipulate the visualized data cube through
interactions such as zooming, rotating, and aspect ratio
adjustment.
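
The limitation of lower-dimensional reductions can be seen in a toy
example. The sketch below is not related to OsrsVol; it assumes a
cube indexed as cube[wavelength][y][x], and the function names and
values are hypothetical.

```python
def collapse_spectral(cube):
    """Sum over the wavelength axis: cube[w][y][x] -> image[y][x]."""
    n_w, n_y, n_x = len(cube), len(cube[0]), len(cube[0][0])
    return [[sum(cube[w][y][x] for w in range(n_w)) for x in range(n_x)]
            for y in range(n_y)]

def spectrum_at(cube, y, x):
    """Extract the spectrum of a single spatial pixel."""
    return [plane[y][x] for plane in cube]

# Three wavelength channels, one row, two spatial pixels. Pixel (0, 0)
# contains two emitters at different wavelengths along the same sightline.
cube = [[[5.0, 0.0]],
        [[0.0, 0.0]],
        [[5.0, 1.0]]]

image = collapse_spectral(cube)
spectrum = spectrum_at(cube, 0, 0)
print(image, spectrum)
```

In the collapsed image the two emitters blend into a single bright
pixel, whereas the spectrum of that pixel shows two separate peaks.
Keeping all three dimensions visible at once, as volume rendering
does, avoids having to switch between such partial views.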


Ciulo et al. [CCM*20] used OsrsVol to identify four objects
orbiting the supermassive black hole at the center of our galaxy,
Sagittarius A*. Two unusual objects, referred to as the G sources,
have recently been discovered around Sagittarius A*, and their
possible tidal interactions with the black hole have generated
considerable attention. Ciulo et al. selected 24 relevant data cubes
and processed them through the OSIRIS pipelines. They analyzed the
data cubes with OsrsVol, as well as several conventional 1D/2D
visualization tools. OsrsVol helps to disentangle the various
dimensions of data cubes and allows more flexible exploration of
crowded regions. Using OsrsVol, Ciulo et al. also characterized the
best-fit orbits of the four new objects, and determined that they
exhibited many traits in common with the previously discovered G
sources.


Figure 8: Felix: Extracting filamentary structures (orange) from a
Voronoi evolution time-series dataset. Image reproduced from
Shivashankar et al. [SPN*16].


Feature identification with deep learning. We end this section 
by giving a couple of examples of using neural network models 
as feature extractors for unsupervised clustering of galaxies. These 
works demonstrate the potential of using deep learning in feature 
identification tasks, for which both astronomers and visualization 
experts are cautiously excited. 


Aragon-Calvo was the first to apply a deep convolutional neu- 
ral network to the task of semantic segmentation of the cosmic 
web [AC19]. He proposed a network with a U-net architecture and 
trained the model using a state-of-the-art manually guided
segmentation method. Two types of training datasets were generated
using the standard Voronoi model and an N-body simulation. The
method provides exciting results as it efficiently identifies filaments
and walls with high accuracy for well-structured data such as the
Voronoi model. For more complex datasets such as the N-body
simulation, the U-net achieves higher quality segmentation than the
state-of-the-art methods.


Khan et al. [KHW* 19] constructed galaxy catalogs using trans- 
fer learning. They employed a neural-network-based image clas- 
sifier Xception, pre-trained on ImageNet data, to classify galaxies 
that overlap both Sloan Digital Sky Survey (SDSS) and Dark En- 
ergy Survey (DES) surveys, achieving state-of-the-art accuracy of 
99.6%. Khan et al. then used their neural network classifier to la- 
bel and characterize over 10,000 unlabelled DES galaxies, which 
do not overlap previous surveys. They further extracted abstract 
features from one of the last layers of their neural network and 
clustered them using t-SNE, a dimensionality reduction technique. 
Their clustering results revealed two distinct galaxy classes among 
the unlabelled DES images based on their morphology. The analysis
of Khan et al. provides a path forward in creating large-scale
DES galaxy catalogs by using these newly labelled DES galaxies as
data for recursive training.


Galaxy clusters are gravitationally bound systems that contain
hundreds or thousands of galaxies in dark matter halos [NZE*19],
with typical masses ranging from 10^14 to 10^15 solar masses.
Ntampaka et al. applied deep learning to estimate galaxy cluster
masses from mock Chandra observations, that is, simulated,
low-resolution, single-color X-ray images [NZE*19]. They used a
relatively simple convolutional
neural network (CNN) with only three convolutional and pooling 
layers followed by three fully connected layers. Despite the simple 
framework, the resulting estimates exhibit only small biases com- 
pared to the true masses. The main innovation of the paper is the 
visual interpretation of the CNN, using an approach inspired by 
Google’s DeepDream, which uses gradient ascent to produce im- 
ages that maximally activate a given neuron in a network. Ntam- 
paka et al. used gradient ascent to discover which changes in the 
input cause the model to predict increased masses. They found that 
the trained model is more sensitive to photons in the outskirts of the 
clusters, and not in the inner regions; and their observations aligned
with other statistical analyses performed on galaxy clusters. Their
work illustrates the utility of interpreting machine learning “black 
boxes” with visualization since it provides physical reasoning to 
predicted features. 
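The idea of running gradient ascent on the input, rather than on the model weights, can be illustrated with a toy example. Everything here is an assumption made for illustration: the "model" is a hand-written function with hand-picked weights standing in for a trained CNN, the input is a four-bin radial profile instead of an image, and the gradient is estimated by finite differences instead of backpropagation. It is not the network or the procedure of Ntampaka et al.

```python
import math

# Hypothetical weights for four radial bins, from cluster core to outskirts.
# They are hand-picked for illustration and are NOT learned from data.
WEIGHTS = [0.1, 0.2, 0.5, 1.0]


def toy_mass_model(profile):
    """Stand-in for a trained network: maps photon counts per bin to a 'mass'."""
    return math.log1p(sum(w * x for w, x in zip(WEIGHTS, profile)))


def numerical_gradient(f, x, eps=1e-5):
    """Central finite-difference gradient of f with respect to the input x."""
    grad = []
    for i in range(len(x)):
        up = x[:i] + [x[i] + eps] + x[i + 1:]
        down = x[:i] + [x[i] - eps] + x[i + 1:]
        grad.append((f(up) - f(down)) / (2 * eps))
    return grad


def gradient_ascent(f, x0, step=0.5, n_steps=20):
    """Repeatedly nudge the input in the direction that increases f."""
    x = list(x0)
    for _ in range(n_steps):
        grad = numerical_gradient(f, x)
        x = [xi + step * gi for xi, gi in zip(x, grad)]
    return x


start = [1.0, 1.0, 1.0, 1.0]
end = gradient_ascent(toy_mass_model, start)
change = [e - s for s, e in zip(start, end)]
most_sensitive_bin = max(range(len(change)), key=lambda i: change[i])
```

Inspecting which inputs change the most shows where the model is most sensitive. In this toy case the answer is fixed by the hand-picked weights; with a trained network the same inspection is what yields a physical interpretation.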


6. Object Reconstruction 


Research works in this category provide informative visual repre- 
sentation of astronomical objects; see Table 4 for their fine-grained 
classifications under secondary and tertiary categories, where there 
is a strong focus on observational data. Object reconstruction uti- 
lizes and is also constrained by imagery and other observational 
data obtainable from our vantage point (the Earth and the solar
system). The works surveyed here cover 3D object reconstructions
using 2D images [SKW*11, WAG*12, WLM13, HA20], distances of
young stellar objects [GAM* 18], spectroscopic data [VD11], and 
extrapolation from sparse datasets such as SDSS [EBPF21], where 
visualization helps produce plausible reconstructions that provide 
structural insights for analysis and modeling. Important challenges 
include scalable computation, trade-off between automatic recon- 
struction and expert knowledge, and in particular, physically accu- 
rate structural inference with limited observations. 


As mentioned previously, we recognize that “objects” are, in 
fact, “features” with sharp and/or discontinuous contrast in a di- 
mension of scientific interest. Whether a specific aspect of a dataset 
is considered an “object” or a “feature” depends on the scien- 
tific question posed. We separate object reconstruction from feature 
identification to be compatible with the literature, but we envision 
a future where these entities are recognized as a continuum. An ex- 
ample of such a continuum is Polyphorm [EBPF21], where the fil- 
ament reconstruction and interactive visualization are intertwined 
via a fitting session, where structural or visual parameters are ad- 
justed interactively to produce satisfactory reconstruction results. 


Object reconstruction employs both images and other observa- 
tional data, and thus is closely related to image reconstruction in 
astronomy. As discussed in Sect. 1, we do not consider state-of-the- 
art image reconstruction methods in astronomy based on optimiza- 
tions or signal processing techniques, but rather, we will focus on 
reconstruction with modern visualization techniques, such as 3D 
object reconstruction, 3D rendering, and interactive visualization. 
There is existing literature on the “historic account” of astronomi- 
cal image reconstruction [Dai85,PGY05], recent surveys about this 
field [TA16], and machine learning approaches [Fla17]. 


3D object reconstruction from 2D images. Steffen 
et al. [SKW*11] presented Shape, one of the first publicly 
available tools using interactive graphics to model astronomical 
objects. Shape allows astrophysicists to interactively define 3D 
structural elements using their prior knowledge about the object, 
such as spatial emissivity and velocity field. Shape provides a 
unified modeling and visualization flow, where physical knowledge 
from the user is used to construct and iteratively refine the model, 
and model parameters are automatically optimized to minimize 
the difference between the model and the observational data. The 
interactive feedback loop helps introduce expert knowledge into 
the object reconstruction pipeline and has proven to be very
useful for many applications, such as rendering hydrodynamical
simulations, reconstructing the Saturn Nebula, and modeling the
structure and expansion of nova RS Ophiuchi [SKW*11]. Shape also
has educational potential in digital planetariums.


Table 4: Classifying papers under object reconstruction based on
secondary and tertiary categories.


Wenger et al. [WAG* 12] developed an automatic 3D visualiza- 
tion of astronomical nebulae from a single image using a tomo- 
graphic approach. Their 3D reconstruction exploits the fact that 
many astronomical nebulae, interstellar clouds of gas and dust, ex- 
hibit approximate spherical or axial symmetry [MKDH04]. This 
symmetry allows for object reconstruction by replicating multiple 
virtual viewpoints based on the view from Earth. This assemblage 
of different views results in a tomographic reconstruction problem, 
which can be solved with an iterative compressed sensing algo- 
rithm. The reconstruction algorithm relies on a constrained opti- 
mization and computes a volumetric model of the nebula for in- 
teractive volume rendering. Wenger et al. demonstrated that their 
method preserves a much higher amount of detail and visual variety 
than previous approaches. However, they also noted that the quality 
of their reconstruction is limited by the fact that “the algorithm has 
no knowledge about the physical processes underlying the objects 
being reconstructed”, and suggested restricting the search space to 
solutions compatible with a physical model [WAG* 12]. 


In a follow-up work, Wenger et al. [WLM13] presented an al- 
gorithm based on group sparsity that dramatically improves the 
computational performance of the previous approach [WAG* 12] 
(see Fig. 9). Their method computes a single projection instead of 
multiple projections and thus reduces memory consumption and 
computation time. It is again inspired by compressed sensing: an
ℓ∞ group sparsity regularizer is used to suppress noise, and an ℓ2
data term is used to ensure that the output is consistent with the
observational data [WLM13]. This method enables astronomers and
end users in planetariums or educational facilities to reconstruct 
stellar objects without the need for specialized hardware. 
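One generic way to write an objective of this kind, combining an ℓ2 data term with an ℓ∞ group sparsity regularizer, is sketched below. The symbols are assumptions introduced for illustration: x is the unknown volume, A a projection operator, b the observed image, 𝒢 a partition of the unknowns into groups, and λ a weight. The exact grouping, constraints, and weighting used in [WLM13] are not reproduced here.

```latex
\min_{x}\; \tfrac{1}{2}\,\lVert A x - b \rVert_2^2 \;+\; \lambda \sum_{g \in \mathcal{G}} \lVert x_g \rVert_\infty
```

The first term penalizes disagreement with the observation, while the second drives entire groups of unknowns toward zero together, which is how a regularizer of this type suppresses isolated noise.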


Hasenberger et al. [HA20] added to the hallowed pantheon of au- 
tomatic object reconstruction algorithms with AVIATOR: a Vienna 
inverse-Abel-transform-based object reconstruction algorithm. Ex- 
isting reconstruction techniques (e.g., [WAG* 12,WLM13]) contain 
potentially problematic requirements such as symmetry in the plane 
of projection. AVIATOR’s reconstruction algorithm instead assumes that,
for the object of interest, its morphology “along the line of sight
is similar to its morphology in the plane of the projection and that 
it is mirror symmetric with respect to this plane” [HA20]. Hasen- 
berger et al. applied AVIATOR to dense molecular cloud cores and 
found that their models agreed well with profiles reported in the 
literature. 
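For readers unfamiliar with the inverse Abel transform, the sketch below shows its simplest discrete form, often called onion peeling, for a spherically symmetric object made of uniform shells. This is the classical textbook case and not AVIATOR's algorithm, which relaxes the symmetry requirement; the shell discretization and all function names are assumptions made for illustration.

```python
import math


def chord_lengths(n_shells, dr):
    """L[i][k]: path length of sight line i through shell k.

    Shell k spans radii [k*dr, (k+1)*dr]; sight line i has impact
    parameter i*dr, so it only crosses shells with k >= i.
    """
    L = [[0.0] * n_shells for _ in range(n_shells)]
    for i in range(n_shells):
        b = i * dr
        for k in range(i, n_shells):
            outer = math.sqrt(((k + 1) * dr) ** 2 - b ** 2)
            inner = math.sqrt((k * dr) ** 2 - b ** 2)
            L[i][k] = 2.0 * (outer - inner)
    return L


def project(density, dr):
    """Forward (Abel) projection: shell densities -> column densities."""
    n = len(density)
    L = chord_lengths(n, dr)
    return [sum(L[i][k] * density[k] for k in range(i, n)) for i in range(n)]


def onion_peel(column_density, dr):
    """Inverse projection by back-substitution from the outermost shell inward."""
    n = len(column_density)
    L = chord_lengths(n, dr)
    density = [0.0] * n
    for i in reversed(range(n)):
        outer_part = sum(L[i][k] * density[k] for k in range(i + 1, n))
        density[i] = (column_density[i] - outer_part) / L[i][i]
    return density


true_density = [4.0, 3.0, 2.0, 1.0]
recovered = onion_peel(project(true_density, dr=0.5), dr=0.5)
```

The outermost sight line crosses only the outermost shell, so its density follows directly; each step inward then subtracts the already known contribution of the outer shells.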


3D object reconstruction using stellar object distances. The
Gaia data release 2 (Gaia DR2) contains a wealth of information
about the night sky. Großschedl et al. [GAM*18] used the distances
of 700 stellar objects from this dataset to infer a model of Orion A
that describes its 3D shape and orientation. This 3D model leads to
many insights, among them that the nebula is longer than previously
thought and that it has a cometary shape pointing toward the
Galactic plane, where the majority of the Milky Way’s disk mass
lies. The authors pointed out that Gaia is bringing the critical third
spatial dimension to infer cloud structures and to study the
star-forming interstellar medium.


Figure 9: A fast reconstruction algorithm that creates 3D models
of nebulae based on their approximate axial symmetry. Image
reproduced from Wenger et al. [WLM13].


In a similar manner, Skowron et al. [SSM*19] constructed a 3D
map of the Milky Way galaxy using the positions and distances
of thousands of classical Cepheid variable stars. Cepheid variables
are regularly pulsating stars whose pulsation periods are tightly
related to their luminosities; comparing the luminosity inferred
from the period with the observed brightness allows their distances
to be calculated precisely. Skowron et al. used 2341 such stars to
sketch the Milky Way galaxy and observe the warped shape of the
galactic disk, and they were able to define the characteristics of this
warping with some precision. They visualized and performed
additional analysis on this 3D map using a combination of static
2D/3D plots.
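The step from pulsation period to distance can be written out as a short calculation. The sketch below assumes a linear period-luminosity relation with placeholder coefficients and ignores interstellar extinction; the coefficients, function names, and sample magnitudes are illustrative assumptions and are not the calibration used by Skowron et al.

```python
import math


def absolute_magnitude(period_days, slope=-2.43, zero_point=-4.05):
    """Illustrative period-luminosity relation.

    M = slope * (log10(P) - 1) + zero_point, with P in days. The default
    coefficients are placeholder values for illustration only.
    """
    return slope * (math.log10(period_days) - 1.0) + zero_point


def distance_parsec(apparent_mag, absolute_mag):
    """Invert the distance modulus m - M = 5 * log10(d / 10 pc)."""
    return 10.0 ** ((apparent_mag - absolute_mag + 5.0) / 5.0)


M = absolute_magnitude(10.0)    # a Cepheid with a 10-day period
d = distance_parsec(10.95, M)   # hypothetical apparent magnitude
```

With these placeholder numbers the distance modulus is 15 magnitudes, which corresponds to roughly 10,000 parsecs. A real analysis would also correct for extinction and propagate the uncertainty of the calibration.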


3D object reconstruction using spectroscopic data. Vogt 
et al. [VD11] aimed to characterize the 3D shape of a young 
oxygen-rich supernova remnant (N132D) in the Large Magellanic
Cloud, a satellite dwarf galaxy of the Milky Way. Using spectro- 
scopic data from the Wide Field Spectrograph along with sophisti- 
cated data reduction techniques, they produced a data cube, which 
they used to construct a 3D map of the oxygen-rich ejecta of the 
supernova remnant of interest. They provided several different 2D 
and 3D plots showing unique views of this 3D map. Their visual 
analysis has led to insights about the structure of this supernova 
remnant beyond what was previously known. 


Dark matter filament reconstruction. Polyphorm [EBPF21] is 
an interactive visualization and filament reconstruction tool that 
enables the investigation of cosmological datasets (see Fig. 10). 
Through a fast computational simulation method inspired by the 
foraging behavior of Physarum polycephalum, astrophysicists are 
able to extrapolate from sparse datasets, such as galaxy maps 
archived in the SDSS, and then use these extrapolations to inform 
analyses of a wide range of other data, such as spectroscopic obser- 
vations captured by the Hubble Space Telescope. Researchers can 
update the simulation at interactive rates by adjusting a wide range
of model parameters. Polyphorm has been used to reconstruct the
cosmic web from galaxy observations [BET*20] and to infer the 
ionized intergalactic medium contribution to the dispersion mea- 
sure of a fast radio burst [SBP* 20]. 


Visual verification of simulations. Coronal mass ejections (CMEs)
are powerful eruptions from the surface of the Sun. Currently,
predictions of CMEs rely on simulations generated from observed
satellite data. These simulations possess inherent uncertainty
because the input parameters are entered manually, and the observed
satellite data may contain measurement inaccuracies. These
simulations treat CMEs as singular objects with discrete,
well-defined boundaries, which enables their treatment as entire
objects. In order to mitigate this uncertainty, Bock
et al. [BPM*15] proposed a multi-view visualization system that 
generates an ensemble of simulations by perturbing the CME input 
parameters, and enables comparisons between these simulations 
and ground truth measurements. The system has many capabilities 
useful to domain experts, including integration of 3D rendering of 
simulations with satellite imagery, comparison of simulation pre- 
dictions with observed data, and time-dependent analysis. 
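The ensemble idea can be sketched independently of any particular solar model: perturb each input parameter around its manually entered value, run the simulation for every combination, and rank the members against an observation. In the example below the "simulation" is a deliberately crude constant-speed travel-time formula, and the parameter names, spreads, and observed arrival time are all made up for illustration; none of it is taken from the system of Bock et al.

```python
import itertools

AU_KM = 1.496e8  # approximate Sun-Earth distance in kilometers


def toy_arrival_hours(params):
    """Hypothetical stand-in for a CME simulation: constant-speed travel to 1 AU."""
    travel = AU_KM / params["speed_km_s"] / 3600.0
    return params["launch_delay_h"] + travel


def build_ensemble(base, spread, steps=(-1, 0, 1)):
    """Perturb every parameter by each step times its spread (full grid)."""
    names = sorted(base)
    return [
        {name: base[name] + k * spread[name] for name, k in zip(names, combo)}
        for combo in itertools.product(steps, repeat=len(names))
    ]


base = {"speed_km_s": 800.0, "launch_delay_h": 0.0}
spread = {"speed_km_s": 100.0, "launch_delay_h": 2.0}
observed_arrival_h = 60.0  # made-up "ground truth" for this example

ensemble = build_ensemble(base, spread)
ranked = sorted(
    ensemble,
    key=lambda p: abs(toy_arrival_hours(p) - observed_arrival_h),
)
best = ranked[0]
```

A full grid grows exponentially with the number of parameters, so a practical system would more likely sample the perturbations; the grid is used here only to keep the example deterministic.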


3D visualization of planetary surfaces. Ortner et al. [OWN* 20] 
performed 3D reconstruction and visualization for planetary geology.
Their geological analysis of 3D Digital Outcrop Models is used 
to reconstruct ancient habitable environments, which serves as an 
important aspect of the upcoming ESA ExoMars 2022 Rosalind 
Franklin Rover and the NASA 2020 Rover Perseverance missions 
on Mars. They conducted a design study to create InCorr (Inter- 
active data-driven Correlations), which includes a 3D geological 
logging tool and an interactive data-driven correlation panel that 
evolves with the stratigraphic analysis. See [Ger14, Section 2.2.2] 
for more references on Mars geology and geodesy data and tools. 
Bladin et al. [BAB* 18] integrated multiple data sources and pro- 
cessing and visualization methods to interactively contextualize 
geospatial surface data of celestial bodies for use in science com- 
munication. 


7. Education and Outreach 


A paradigm shift is under way in scientific outreach. Technological
advances are making data-driven and interactive exploration
possible in public environments such as museums and science
centers, increasing their availability to the general public. These
advances are shortening the distance between research and outreach
material, and enriching the scientific exploration process with new
perspectives. Ynnerman et al. [YLT18] and Goodman et al. [GHWY19]
introduced the Exploranation concept, a portmanteau encapsulating
this confluence of explanation and exploration.


Scientific storytelling of astrophysical findings using
visualization has a deep history. Ma et al. [MLF*12] described how
visualization can aid scientific storytelling using the NASA
Scientific Visualization Studio. Borkiewicz et al. described
storytelling based on data-driven cinematic visualization in a
SIGGRAPH course [BCK*19]. More recently, and with a greater focus on
interactive methods where the user becomes part of the exploration,
Bock et al. [BHY18] described the challenge of presenting the
details of NASA space missions to the public.


Figure 10: Polyphorm: reconstruction of dark matter filaments in
the simulated BolshoiPlanck dataset where Polyphorm yields a
consistent 3D structure, enabling its calibration to cosmic
overdensity values. Thin slices of the filament map are shown on the
right. Image reproduced from Elek et al. [EBPF21].


Table 5: Classifying papers under education and outreach based on
secondary and tertiary categories. Top row, from left to right:
(primary category) Education and outreach; (secondary categories)
2D/3D plots, 2D images, 3D rendering, interactive visualization,
dimensionality reduction, uncertainty visualization, and new display
platforms; (tertiary categories) extragalactic, galactic, planetary,
and solar astronomy;


Research efforts in this category (as summarized in Table 5)
address important aspects of education, outreach, and/or public
accessibility. In addition, several are concerned with large-scale
immersive visualization in planetariums and also with personal vir- 
tual reality (VR) and augmented reality (AR) experiences. Absent 
from the current literature, to the best of our knowledge, is a com- 
prehensive analysis of the effect of VR for scientific exploration in 
astronomy. 


Planetarium and other large-scale immersive environments. 
Immersive visualization in planetarium dome theaters (see Fig. 11) 
has been the primary outreach mechanism for astronomy since
their inception. The immersive nature of the display system
plays an important role in the contextualization of the available
data, which is one of the unique challenges of astronomical
datasets. The use of interactive visualization software in
planetariums can be traced to the Uniview software [KHE*10],
which pioneered many of the interaction paradigms that are still 
in use today. To a large extent, these live presentations based on 
interactive visualization are enabled by software provided by the 
planetarium vendors, which are, in general, commercial solutions 
and thus fall outside the scope of this survey. Our focus here is in- 
stead on the large number of open-source initiatives, which are eas- 
ily accessible to the academic community, targeting planetariums 
and other large-scale immersive environments. Although aimed at 
the use in immersive environments, these initiatives also constitute 
a bridge between outreach and research-driven data exploration, as 
described by Faherty et al. [FSW* 19], which is increasingly gain- 
ing momentum. 


Among the most widely used software packages tailored to as- 
trophysical data in immersive environments is WorldWide Tele- 
scope [RFG* 18], which is a virtual observatory for the public to 
share and view data from major observatories and telescopes. The 
software provides the capability to visualize the solar system and 
stars, and show observational material in context; however, it
focuses on the data as displayed from the Earth’s viewpoint.
Celestia is an open-source initiative that shows the objects of the solar
system and the greater universe in a 3D environment. It provides 
high-resolution imagery and accurate positioning of the planetary 
bodies of the solar system and the ability to show other datasets in 
their context outside the solar system. OpenSpace, Gaia Sky, and 
ESASky, as described in Sect. 4, also provide contextualization of 
astronomical data, but with a stronger emphasis on the ability for 
domain experts to import their data into an immersive environment 
for public presentations. The Stellarium software can be used by the 
general public to look at a virtual night sky from the surface of any
planet. The data contained in the software include a star catalog, 
known deep space objects, satellite positions, and other datasets 
that can be added dynamically by the user. NASA Eyes is a suite 
of web-based visualization tools that enable the user to learn about 
the Earth, Solar System, Exoplanets, and ongoing NASA missions. 
While providing a rich experience for the end user, the avenues 
for extension are limited. The Mitaka software [USTY 18] enables 
users to explore the observable universe and makes it easy for them 
to create custom presentations that can be shown on a large variety 
of immersive display environments. 


Personal virtual and augmented reality. In stark contrast to the
immersive display environments described thus far, a large amount
of work has been presented in the realm of virtual reality (VR) and
augmented reality (AR). VR in this context refers to a personal
experience, rather than one that is shared with other participants.
These two fields overlap in some areas, but they present distinct
research challenges.


Figure 11: An example of an interactive presentation in a
large-scale immersive environment to the general public, in this
case of topographical features on the surface of Mars, presented at
the Brilliant Minds conference.


With 3DMAP-VR, Orlando et al. [OPB* 19] sought to unite the 
exceedingly data-rich world of astronomy with VR in the hope
of facilitating productive engagement with traditionally inaccessi- 
ble data. Specifically, they visualized 3D magnetohydrodynamic 
(MHD) models of astronomical simulations, which include the ef- 
fects of gravity and hydrodynamics as well as magnetic fields. 
Their workflow consisted of two steps: obtaining accurate simu- 
lations and then converting these simulations to navigable 3D VR 
environments using available data analysis and visualization
software. In addition to providing a method to explore these dense
data cubes in VR, they allowed these VR environments to be ex- 
plored by anyone with a VR rig by uploading them to Sketchfab, 
a popular open platform for sharing VR content. Orlando et al. ex- 
celled at meeting two emerging goals in astronomical visualization: 
first, using existing software to achieve their goals rather than creat- 
ing something from scratch; and second, making the visualizations 
widely accessible. 


Arcand et al. [AJP*18] developed a 3D VR and AR pro- 
gram to visualize the Cassiopeia A (i.e., Cas A) supernova rem- 
nant, the resulting structure from an exploded star. They aimed to 
make the best use of the high-resolution, multi-wavelength, multi- 
dimensional astronomical data and give users the experience of 
walking inside the remains of a stellar explosion. They first per- 
formed 3D reconstruction of Cas A and then employed volume and 
surface rendering to display the model in the VR system YURT (i.e., 
Yurt Ultimate Reality Theatre). The user can select a specific part 
of the supernova model and access the annotations. These interac- 
tive features not only help non-experts engage in the story of the 
star, but also help researchers observe changes in its properties.


Vogt et al. [VS13] explored the potential of AR in astrophysics 
research/education by introducing Augmented Posters and Aug- 
mented Articles. The authors included Augmented Posters at the 
Astronomical Society of Australia Annual General meeting in 
2012. Incorporating AR into posters allowed attendees to use their 
smartphones to engage with the posters in a virtual space and eas- 
ily save and share the poster information if they found it interest- 
ing. Through tracking of engagement and feedback, they discov- 
ered that the majority of conference attendees found the technology 
to “have some potential.” As mentioned, the authors also experi- 
mented with Augmented Articles. They showed how results from 
an earlier work (the 3D structure of supernova remnants) can be
viewed in 3D interactively within the article using a smartphone. 
Vogt et al. concluded by speculating on the future of AR in astro- 
physics. They were optimistic about the potential for AR to be an 
effective supplementary technology, but cited long-term stability 
and backwards compatibility in terms of AR apps and technology 
as a major limitation to AR moving forward. They suggested that a 
dedicated service for AR used in scientific publishing and outreach 
may be an effective way to handle this limitation. 


Novel interfaces. Madura [Mad17] presented a case study using 
3D printing to visualize the η Car Homunculus nebula (see Fig. 12).


Extending the traditional monochromatic 3D prints, Madura pro- 
posed to use full-color sandstone prints to generate more infor- 
mative models. Although the sandstone material is not as sturdy, 
these printers produce noticeably higher quality prints that preserve 
smaller details. The colors of the prints can be based on physical 
properties, which provides additional information to visual learners 
and helps distinguish different structures. The 3D models not only 
facilitate research discoveries, but also help communicate scientific 
discoveries to non-astronomers, especially to the visually impaired. 
The New Mexico Museum of Space History and the New Mexico 
School jointly hosted the first week-long astronomy camp for
visually impaired students from multiple states in the summer of 2015. The
camp received overwhelmingly positive feedback. Madura also dis- 
cussed the use of other methods, including audio, tactile sign lan- 
guage, and tactile fingerspelling, to further expand the 3D model in- 
teractive experience for tactile learners. Overall, 3D printing could 
be a useful and effective tool for astronomy outreach and education.


Figure 12: Dual-color 3D print of the η Car Homunculus nebula.
Image reproduced from Madura [Mad17]. 


In a work as artistic as it is scientific, Diemer et al. [DF17] 
modeled the cosmic web through dark matter simulations and rep- 
resented it artistically through 3D sculptures and woven textiles. 
The dark matter simulation is run using the publicly available code 
GADGET2 and the halos and subhalos (i.e., the halos within ha- 
los), are identified using ROCKSTAR. To identify the structural el- 
ements of the cosmic web (walls and filaments), they used DIS- 
PERSE, which is publicly available code that leverages discrete 
Morse theory to identify salient topological features (see Sect. 5). 
Converting from the simulation data to their artistic representation, 
Diemer et al. stated that “we believe that art, as much as science, 
seeks to say something true about the nature of existence, and that 
end is best served by artistic representation that grapples with real 
data and not only with allegorical concepts.” They accomplished 
this stated ideal through a structured simplification of the model to 
a form where it can be represented using 3D woven textiles. The 
techniques used in their paper are not novel, but the combination 
of them is, and the end result is a powerful installation that instills 
wonder in those who move through it. This work is able to take sci- 
entifically rigorous simulation data and represent it in an accessible 
form without losing the deep beauty of the underlying science. 


This elegant translation of numbers to forms, of thoughts to
feelings, lies at the heart of science communication and outreach.
This translation is especially crucial for astronomy, where the
physical embodiment of the things we are studying can really only
ever live in our minds and as the modern equivalent of paint on the
walls of our caves.


Virtual observatories. Virtual observatories (VOs) are web-based 
repositories of astronomical data from multiple sources with the 
goal of improving access to astronomical data for scientists not di- 
rectly involved in data collection. Many advanced VO applications 
aid in the mining and exploration of observational data through the 
use of state-of-the-art visualization techniques, but comparatively
few perform similar functions for theoretical data. In their work on
interactive 3D visualization for theoretical VOs, Dykes et al. [DHG*18]
examined the current capabilities of VOs containing theoretical 
data, and additionally presented a tool based on SPLOTCH, which 
is designed to aid in addressing some of the shortcomings they 
identified with current methods. SPLOTCH is a visualization tool 
designed for high-performance computing environments that is ca- 
pable of quickly rendering large particle datasets, which makes it 
ideal for interactive visualization of 3D particle data. Dykes et al. 
combined their tool with a VO and demonstrated the effectiveness 
of interactively filtering and quantitatively visualizing the data for 
identifying features of interest in large particle simulations. Future 
steps involve comparative visualization, which would consist of 
methods to generate mock 2D observational images from the 3D 
simulation data to compare with actual observations. 
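The core of generating a mock 2D image from 3D particle data is a projection along the line of sight. The sketch below bins particle masses onto a pixel grid by dropping the z coordinate; it is a minimal orthographic projection written for illustration and is not SPLOTCH. The function name, the particle tuple layout, and the sample data are assumptions, and a realistic mock observation would also model smoothing kernels, instrument response, and noise.

```python
import math


def project_particles(particles, n_pix, lo, hi):
    """Sum particle masses into an n_pix x n_pix image, ignoring z.

    Each particle is (x, y, z, mass); [lo, hi) is the extent covered by the
    image along both x and y. Particles outside the extent are dropped.
    """
    image = [[0.0] * n_pix for _ in range(n_pix)]
    scale = n_pix / (hi - lo)
    for x, y, _z, mass in particles:
        ix = math.floor((x - lo) * scale)
        iy = math.floor((y - lo) * scale)
        if 0 <= ix < n_pix and 0 <= iy < n_pix:
            image[iy][ix] += mass
    return image


particles = [
    (0.1, 0.1, 0.5, 1.0),
    (0.1, 0.2, 0.9, 2.0),   # same pixel as the first particle
    (0.9, 0.9, 0.1, 4.0),
    (1.5, 0.5, 0.5, 8.0),   # outside the image extent, ignored
]
mock_image = project_particles(particles, n_pix=2, lo=0.0, hi=1.0)
```

The resulting grid can then be compared pixel by pixel with an observed image that covers the same field of view.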


Broad dissemination through mobile applications. Furthermore, 
recent progress has been made on mobile and commercial appli- 
cations that provide visualizations of astronomical data. Although 
these applications are not focused on research questions, they serve 
to broadly disseminate astrophysics visualizations, many of which 
also have strong educational value for the general public. 


For instance, SkySafari (https://skysafariastronomy.com/),
StarMap (http://www.star-map.fr/), NightSky
(https://icandiapps.com/), or SkyView (on Apple App Store)
bring stargazing to everyone’s mobile phones, offer AR features
that guide the layperson towards targets of interest, and can also
be used to control amateur telescopes and aid in astrophotography.
Many JavaScript libraries and tools are available for astronomical
visualization, such as Asterank (https://www.asterank.com/),
based on spacekit.js (https://typpo.github.io/spacekit/),
which enables the user to visually explore a database containing
over 600,000 asteroids, including estimated costs and rewards of
mining asteroids.


Web applications that introduce scientific knowledge of astron- 
omy are also easily accessible. For example, NASA JPL’s Eyes on 
the Solar System (https://eyes.nasa.gov/) has the ability to
show the dynamics of our solar system, but can also be used to 
show the evolution of space missions, and the discovery of exo- 
planets. Another example is an adaptation of the Uniview software 
to be used in schools on mobile platforms targeting grades 4-6 in 
the NTA Digital project. 


There are also a number of popular applications for VR headsets,
such as Star Chart (http://www.escapistgames.com/)
and Our Solar System on Oculus, which provide immersive
experiences for users seeking knowledge about the Universe.
Merge Cube (https://mergeedu.com/cube), accompanied by
AR/VR apps such as MERGE Explorer, has been used to enable a
new way of interactive learning for astronomy and beyond. It
allows users to hold digital 3D objects and explore stars and galaxies
in their palms.


8. Challenges and Opportunities 


In addition to a taxonomy of existing approaches that utilize vi- 
sualization to study astronomical data from Sect. 3 to Sect. 7, our 
contribution includes a summary of the current challenges and op- 
portunities in visualization for astronomy. We ask the following 
questions: What are the missing tools in current visualization re- 
search that astronomers need to formulate and test hypotheses us- 
ing modern data? What visualization capabilities are expected to 
become available for astronomical data over the next decade? 


In a Carnegie + SCI mini-workshop conducted in April 2020 
and a Visualization in Astrophysics workshop during IEEE VIS in 
October 2020, astrophysicists and visualization experts discussed 
recent advances in visualization for astrophysics, as well as the cur- 
rent visualization needs in the astronomical community. As a result 
of these workshops, we have identified the following list of chal- 
lenges and opportunities: 


• Open-source tools: we need more open-source data visualization
  software that is suitable for astronomical data. These tools must
  be flexible, modular, and integrable within a broader ecosystem
  of workhorse tools;

• Intelligent data querying: we need to enable intelligent data
  queries for large data;

• Discovery: we need ways to turn high-quality renderings of data
  (observed and simulated) into quantitative information for
  discovery;

• Scalable feature extraction: we need to extract and visualize
  features from large and physically complex data cubes;

• In situ analysis and visualization: we need to interact with
  simulation data in real time, by utilizing visualization for parameter
  tuning and simulation steering;

• Uncertainty visualization: we need to develop more techniques
  to mitigate and communicate the effects of data uncertainty on
  visualization and astronomy;

• Benchmarks: we need to develop clear, widely adopted
  benchmarks or mock data catalogs for comparison with observed data;

• Time and space efficiency: we need to improve upon memory
  and/or space intensive data analysis tasks.


8.1. Challenges Identified from Previous Surveys 


We first review the challenges identified from previous surveys 
[HF11, LLC*12] and describe how the community has responded 
to these challenges in the past decade. 


Hassan et al. [HF11] identified six grand challenges in their 2011
survey for the peta-scale astronomy era:


• Support quantitative visualization;

• Effectively handle large data sizes;

• Promote discoveries in low signal-to-noise data;

• Establish better human-computer interaction and ubiquitous
  computing;

• Design better workflow integration;

• Encourage adoption of 3D scientific visualization techniques.


Lipsa et al. [LLC*12] discussed visualization challenges in
astronomy in their 2012 survey; however, only a few papers had
addressed these challenges at the time of the survey. These challenges
include:


• Multi-field visualization, feature detection, graphics hardware;

• Modeling and simulation;

• Scalable visualization, error and uncertainty visualization,
  time-dependent visualization, global and local visualization, and
  comparable visualization.


In the past decade, considerable advances have been made in 
addressing the challenges identified by Hassan et al. and Lipsa 
et al. With modern computing power, interactive visualizations 
have become increasingly popular for both scientific explorations 
and public outreach. A variety of scalable visualization tools 
for large simulation and survey data are now easily accessi- 
ble (e.g., ParaView [WHA*11], OpenSpace [BAC*20] and Gaia 
Sky [SJMS19]). Many tools are also adopting graphics hardware 
and parallelism in their visualization, rendering, and analysis
processes to increase efficiency (e.g., yt [TSO*10] and the workflow
of Mohammed et al. [MPF20]). Scientists and educators are also
incorporating novel visual display methods, such as VR [OPB*19]
and 3D printing [Mad17], for education and outreach services.


Visualizations of evolving astronomical systems have also seen 
advances. Lipsa et al. [LLC*12] listed only two papers under
time-dependent astronomy visualization in their survey. In contrast, we
present a number of research papers with the capabilities of
analyzing halo evolution histories [SXL*14, PGX*16, SBD*17] and
rendering real-time stellar orbits [SJMS19]. Volumetric data can
now be rendered in 3D with sufficient numerical accuracy to
enable a wide range of research in feature detection and extraction
(e.g., AstroBlend [Nai16], FRELLED [Tay17b] and Houdini for
astrophysics [NBC17]), see Sect. 5.


Quantitative analysis for heterogeneous data types (Sect. 4.1) 
is often supported as a supplement to the visual analysis (e.g., 
Encube [VBF*16] and TOPCAT [Tay17a]). In order to perform
analytic tasks effectively, many of these tools utilize
visualization techniques such as linked-views (multi-field
visualization [LLC*12], Glue [Glu12]), detail-on-demand (global/local
visualization [LLC*12]), and comparative visualization.


On a higher level, a few techniques and platforms have been 
developed to provide better visualization workflow in astron- 
omy. The Glue visualization environment described by Goodman 
et al. [GBR18] hosts a variety of shared datasets and open-source
software. It facilitates flexible data visualization practices, and 
bridges the gap between scientific discovery and communication. 
On a similar note, the EU-funded CROSS DRIVE [Ger14] cre- 
ates “collaborative, distributed virtual workspaces” in order to unite 
the fragmented experts, data, and tools in European space sci- 
ence. Mohammed et al. [MPF20] formalized the scientific visual- 
ization workflow and brought structure to a visualization designer’s 
decision-making process. The paradigm provided by Mohammed
et al. divides the visualization process into four steps: processing,
computation of derived geometric and appearance properties, ren- 
dering, and display. In each of these steps, the workflow systemati- 
cally incorporates high-performance computing to efficiently work 
with multi-variate multi-dimensional data. 


However, despite the progress, some of the challenges identified 
a decade ago, such as uncertainty visualization and time-dependent 
visualization, remain largely under-explored today or have great 
potential for improvement. A careful inspection of Table 1 to Ta- 
ble 5 gives rise to a number of useful observations regarding re- 
search gaps for further investigation. In this section, we describe a 
number of challenges and opportunities that we believe are essen- 
tial for the development of visualization in astronomy in the years 
to come. 


8.2. Astronomical Data Volume and Diversity 


A challenge identified by both Hassan et al. [HF11] and Lipsa
et al. [LLC*12] is the effective handling of large datasets.
Substantial effort has been invested, and progress made, in processing
large datasets in the past decade in both astronomy and visualization.
Luciani et al. [LCO*14] pre-processed large-scale survey data to
ensure efficient query and smooth interactive visualization.
FRELLED [Tay15] accelerates visual source extraction to enable the
visualization of large 3D volumetric datasets. Filtergraph [BSP*13]
supports the
rapid visualization and analysis of large datasets using scatter plots 
and histograms. Gaia Sky [SJMS19] uses the magnitude-space 
level-of-detail structure to effectively visualize hundreds of
millions of stars from the Gaia mission with sufficient numerical
precision. yt [TSO*10] adopts parallelism to run multiple independent
analysis units on a single dataset simultaneously. 
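
The idea behind a magnitude-ordered level-of-detail structure can be illustrated with a minimal sketch. This is not Gaia Sky's implementation (which also partitions stars spatially); it only shows the principle that the brightest stars sit in the coarsest levels so that a viewer can stop loading at any depth. The function names, the capacities, and the star list are hypothetical.

```python
def build_levels(stars, base_capacity=2, growth=2):
    """Group stars into levels, brightest first.

    stars: list of (name, apparent_magnitude); smaller magnitude = brighter.
    Level 0 holds the base_capacity brightest stars, and each further
    level holds growth times more stars than the previous one.
    """
    ordered = sorted(stars, key=lambda s: s[1])
    levels, start, capacity = [], 0, base_capacity
    while start < len(ordered):
        levels.append(ordered[start:start + capacity])
        start += capacity
        capacity *= growth
    return levels


def visible(levels, max_level):
    """Stars to draw when only levels 0..max_level are loaded."""
    return [star for level in levels[:max_level + 1] for star in level]


# Hypothetical catalog, for illustration only.
stars = [("A", 5.2), ("B", -1.4), ("C", 0.0), ("D", 9.9),
         ("E", 3.1), ("F", 7.5), ("G", 12.0)]
levels = build_levels(stars)
coarse_view = [name for name, _ in visible(levels, 0)]
```

A distant or zoomed-out view would request only the first few levels, while a close-up view would load deeper levels for the region in focus.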


Visualizing large datasets remains a challenge for astronomical
data, especially because of its multi-dimensional nature.
Visualization researchers recognize that scalability is an immediate
obstacle that prevents them from introducing many interactive
capabilities [Tay15, PGX*16, SKW*11, WAG*12, SBD*17]. For analysis
tasks, the developers of yt identified the challenge of load balancing
for parallel operations on large simulation data. They added support
for robust CPU/GPU mixed-mode operation to accelerate
numerical computation [TSO*10]. We believe that even more
improvements can be achieved by using network data storage and
high-performance computing.


As the volume and diversity of data increase rapidly, connect- 
ing related heterogeneous datasets has become a priority in astron- 
omy. Goodman et al. [GBR18] identified the growing open-source 
and collaborative environment as the future of astronomy. They 
described the Glue [Glu12] visualization environment, a platform 
that hosts a large variety of data and numerous open-source mod- 
ular software. The Glue environment allows users to load multiple 
datasets at once and “glue” the related attributes together from dif- 
ferent data types. Many exploratory astronomy visualization soft- 
ware packages (e.g., OpenSpace, ESASky) are capable of dealing 
with various data types. Some can integrate with Glue, which fur- 
ther improves their integrability and flexibility. 


Nevertheless, most software packages are still striving to
expand the variety of data formats that they can process. Naiman
et al. [NBC17] aimed to use Houdini to render data with
non-uniform voxel sizes. Baines et al. [BGR*16] retrieved
spectroscopic data and aimed to link it to more of the mission catalogs
for the next release of ESASky. Burchett et al. [BAO*19]
incorporated data pre-processing as part of the IGM-Vis application to
allow more data formats as inputs. Vogt et al. [VSDR17] provided 
a unique perspective for simplifying the access to 3D data visu- 
alization by promoting the X3D pathway, as the X3D file format 
lies in the center of various visualization solutions, such as inter- 
active HTML, 3D printing and high-end animations. However, the 
conversion into X3D file format remains the largest obstacle. 


8.3. Interactive Visualization and Intelligent Querying 


Hassan et al. identified “better human-computer-interaction” as one 
of the six grand challenges [HF11], and visualization experts and 
astronomers have joined forces to explore the potential of using in- 
teractive visualization in astronomy research and public outreach. 
We see the overwhelming popularity of interactive visualization in 
the realm of astronomy research. Barnes and Fluke [BF08] demon- 
strated the convenience of embedding interactive visualizations in 
astronomy publications, via 3D PDF and its extension S2PLOT 
programming library. FRELLED and AstroBlend leverage the 3D
capability of Blender to improve the process of analyzing and
visualizing volumetric data [Tay17b, Nai16]. Naiman et al. [NBC17]
explored the potential use of the graphics tool Houdini in
astronomy research. ESASky, LSSGalpy, SDvision, OpenSpace, and
Gaia Sky [BGR*16, AFPR*17, PCHT17, BAC*20, SJMS19] all
provide visual exploratory capabilities to large-scale astronomy survey
data, each with their own scientific focuses and distinguishing
features.


Many of these interactive software tools are expanding their im- 
pact in public outreach. A video produced with the visualizations 
from SDvision — titled “Laniakea: Our home supercluster” — gained 
millions of views on YouTube [PCHT17]. The authors are also pur- 
suing the software’s integration with VR technology to further con- 
tribute to education and public outreach services. OpenSpace has 
already demonstrated its success in museums, planetariums, and 
a growing library of publicly accessible video material. The soft- 
ware is built to be easily accessible to the general public via a sim- 
ple installation onto any computer. AR, VR, and 3D printing are 
emerging technologies that are used at a greater scale in educational
and public outreach services [AJW*19, Mad17]. In order to
reach a more artistic audience, Diemer et al. [DF17] have also ex- 
plored integrating art and the physical visualization of astronomical 
objects. 


However, many researchers also recognize the limitations of 
current interactive visualizations and intelligent querying of vol- 
umetric data. Barnes and Fluke [BF08] proposed the capturing of 
mouse clicks on individual elements of a scene to enable 3D selec- 
tion and queries. Goodman advocated for the need of 3D selection 
in astronomy visualization and analysis [Goo12]. Blender allows 
only one side of a transparent spherical mesh to be displayed at a 
time [Tay17b]. The selection of pixels in regions of interest could 
also be a potential problem [Tay17b]. Yu et al. [YEII16] proposed 
several context-aware selections in 3D particle clouds, which help 
to alleviate the issues associated with 3D selection and query.
Felix tackles the challenge of querying by simultaneously satisfying
two density ranges, but Shivashankar et al. [SPN*16] identified
interactive visualization of intricate 3D networks as a “largely
unexplored problem of major significance”. WYSIWYG creates a
marching cube of a 2D selection and finds the cluster with the largest
projection area as the cluster of interest [SXL*14]. The technique
lacks flexibility as it depends heavily on the assumption that the 
largest cluster is always of interest. 


8.4. Uncertainty Visualization 


Uncertainty visualization in astronomy remains largely unexplored, 
even though errors and uncertainties are introduced due to data ac- 
quisition, transformation, and visualization. Li et al. [LFLH07] no- 
ticed that uncertainty visualization is seldom available in astronom- 
ical simulations and developed techniques that enhance perception 
and comprehension of uncertainty across a wide range of scales. 
Since then, a few studies have considered errors created during the 
simulation pipeline. Großschedl et al. [GAM*18] used uncertainty
plots to effectively present the distribution of the data and
demonstrated the confidence in their reconstruction results. With the
direct intention of incorporating uncertainty in the discovery process,
Bock et al. [BPM*15] displayed the uncertainty of space weather
simulation by visualizing an ensemble of simulation results with 
different input parameters (Fig. 13). Combined with a timeline view 
and a volumetric rendering of each ensemble member, scientists are 
able to compare each simulation with measured data, gain an un- 
derstanding of the parameter sensitivities, and detect correlations 
between the parameters. 
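
As a minimal sketch of the quantities such an ensemble view rests on (not the implementation of Bock et al.), the snippet below summarizes an ensemble by its per-timestep mean and spread and ranks the members by their root-mean-square error against measured data. All values and function names are hypothetical.

```python
from statistics import mean, pstdev


def ensemble_summary(members):
    """Per-timestep mean and spread across ensemble members.

    members: list of equally long lists, one time series per simulation run.
    """
    steps = list(zip(*members))
    return [mean(s) for s in steps], [pstdev(s) for s in steps]


def rank_members(members, measured):
    """Indices of the runs, ordered from best to worst RMSE against data."""
    def rmse(run):
        squared = [(r - m) ** 2 for r, m in zip(run, measured)]
        return (sum(squared) / len(squared)) ** 0.5
    return sorted(range(len(members)), key=lambda i: rmse(members[i]))


# Hypothetical ensemble: three runs with different input parameters.
members = [
    [1.0, 2.0, 3.0],
    [1.5, 2.5, 3.5],
    [0.5, 1.5, 2.5],
]
measured = [1.4, 2.4, 3.6]

means, spreads = ensemble_summary(members)
best_first = rank_members(members, measured)
```

The spread per timestep is what an uncertainty band around the mean curve would encode, while the ranking indicates which parameter choices best reproduce the measurements.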


Applying uncertainty visualization to 3D astronomy data is chal- 
lenging because we lack the techniques to deal with sparse/far away 
samples and their large error cones. However, the potential exists to 
display uncertainty in a localized object or regions of interest, and 
that potential must be developed further. 


Figure 13: Ensemble selection view that captures the uncertainty 
of all ensemble runs by displaying the full 4D parameter space. 
Image reproduced from Bock et al. [BPM*15].


8.5. Time Series Data Visualization and Analysis


Most of the current time series data visualizations are built to dis- 
play halo evolution. One common technique is to use a merger tree 
to visualize the development of halos over time [SXL*14, AF15,
PGX*16, SBD*17]. Other techniques are often used along with the
merger tree to enhance the effectiveness of the visualization. Shan
et al. [SXL*14] visualized the changes of selected particles as a
particle trace path image (see Fig. 14). Preston et al. [PGX*16]
added interactivity into their software and facilitated more efficient
analysis for large, heterogeneous data. Scherzinger et al. [SBD*17]
extended their framework by adding particle visualization and anal- 
ysis for individual halos. 
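
To make the merger tree concrete, the following sketch builds one from a flat list of halo records and follows the main progenitor branch backwards in time. The record layout (halo id, snapshot, mass, descendant id) and all values are assumptions for illustration; they do not reflect the data format of any of the cited tools.

```python
from collections import defaultdict


def build_merger_tree(records):
    """Index halos and their progenitors.

    records: iterable of (halo_id, snapshot, mass, descendant_id or None).
    Returns (info, progenitors), where info maps halo_id -> (snapshot, mass)
    and progenitors maps halo_id -> list of halos that merged into it.
    """
    info, progenitors = {}, defaultdict(list)
    for halo_id, snapshot, mass, descendant in records:
        info[halo_id] = (snapshot, mass)
        if descendant is not None:
            progenitors[descendant].append(halo_id)
    return info, progenitors


def main_branch(root, info, progenitors):
    """Follow the most massive progenitor back in time from root."""
    branch = [root]
    while progenitors.get(branch[-1]):
        candidates = progenitors[branch[-1]]
        branch.append(max(candidates, key=lambda h: info[h][1]))
    return branch


# Hypothetical halo catalog: a and b merge into c; c and d merge into e.
records = [
    ("a", 0, 5.0, "c"),
    ("b", 0, 2.0, "c"),
    ("c", 1, 7.0, "e"),
    ("d", 1, 1.0, "e"),
    ("e", 2, 8.0, None),
]
info, progenitors = build_merger_tree(records)
branch = main_branch("e", info, progenitors)
```

A merger tree view would draw every edge in `progenitors`, typically emphasizing the main branch.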


In general, time series data visualization can also be helpful 
when tracking star movements. However, little effort has been ex- 
pended in this regard, with Gaia Sky being the only example, to the 
best of our knowledge. Given the instantaneous proper motion vec- 
tor of stars and simulation time, Gaia Sky [SJMS19] computes a 
representation of proper motions. The software is able to visualize 
real-time star movements with sufficient numerical accuracy. 
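
A first-order version of this computation can be sketched as follows. It assumes, as in common catalog conventions, that the proper motion in right ascension already includes the cos(dec) factor and is given in milliarcseconds per year. It is a small-angle linear extrapolation for illustration, not Gaia Sky's actual algorithm, and the star used here is hypothetical.

```python
import math

MAS_PER_DEG = 3.6e6  # milliarcseconds per degree


def propagate(ra_deg, dec_deg, pmra_mas_yr, pmdec_mas_yr, dt_yr):
    """Linearly extrapolate a sky position by dt_yr years.

    pmra_mas_yr is assumed to include the cos(dec) factor, so it is
    divided by cos(dec) to obtain the change in the RA coordinate.
    """
    ddec = pmdec_mas_yr * dt_yr / MAS_PER_DEG
    dra = pmra_mas_yr * dt_yr / MAS_PER_DEG / math.cos(math.radians(dec_deg))
    return (ra_deg + dra) % 360.0, dec_deg + ddec


def trail(ra_deg, dec_deg, pmra, pmdec, years, steps):
    """Positions at evenly spaced times, e.g. for drawing a motion trail."""
    return [propagate(ra_deg, dec_deg, pmra, pmdec, years * k / steps)
            for k in range(steps + 1)]


# Hypothetical fast-moving star observed at (RA, Dec) = (10, 60) degrees.
path = trail(10.0, 60.0, 1800.0, 3600.0, years=1000.0, steps=4)
```

The linear form breaks down near the celestial poles and over very long time spans, where a full 3D treatment including radial velocity would be needed.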




Figure 14: Interactive linked views of halo evolutionary history. 
Top left shows the evolution path of a selected halo and top right 
is the halo projected onto a 2D screen. Bottom is the merger tree 
visualization of the halo evolution. Image reproduced from Shan et 
al. [SXL*14].


8.6. Machine Learning 


During our recent astrophysics workshops, many astronomers
voiced both interest in and concerns about using machine learning
(ML) techniques, and in particular deep learning, in the
astrophysics discovery process; the concerns mostly surround the
interpretability of “black box” ML models. Active discussions have
concerned the maturity of ML in astronomy, and a number of surveys
have been written to assess this maturity [BB10, FJ20, NAB*19].


Combine visualization with machine learning. Although the con- 
cerns regarding the interpretability of ML models in astronomy are 
valid in some cases, we believe that combining visualization with 
ML models has the potential to make the results more accessible to 


classical theoretical interpretation. Indeed, some of the most suc- 
cessful applications of ML in astrophysics involve cases where the 
interpretation is straightforward. 


The use of deep learning techniques in astrophysics has mostly
been limited to convolutional neural networks (CNNs). Khan
et al. [KHW*19] used a combination of deep learning and
dimensionality reduction techniques to help construct galaxy catalogs.
They extracted the features from one of the last layers of their
pre-trained CNN and clustered the features using t-SNE. Their method
not only leads to promising classification results, but also uses the
misclassified examples to point out errors in the Galaxy Zoo dataset.
Ntampaka et al. [NZE*19] presented a CNN that estimated galaxy
cluster masses from mock Chandra images. They used
visualization to interpret the results of learning and to provide physical
reasoning. Kim and Brunner [KB16] performed star-galaxy
classification using CNNs. They studied images of activation maps,
which help to explain how the model is performing classification
tasks. Apart from deep learning, Reis et al. [RPB*18] used random
forests to generate distances between pairs of stars, and then
visualized the resulting distance matrix using t-SNE. Their techniques
have been shown to be useful for identifying outliers and for learning
complex structures in large spectroscopic surveys.


Many efforts in recent years have focused on interpreting ML 
models [Mol20]. We believe a good starting point to obtain
interpretability is to combine visualization with models that are
inherently interpretable [Rud19] (e.g., linear regression, decision tree,
decision rules, and naive Bayes) in studying astronomical data.
Alternatively, we may train an interpretable model as a surrogate to
approximate the predictions of a black box model (such as a CNN) 
and integrate such a surrogate in our visualization. 
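
The surrogate idea can be sketched in a few lines. Here a one-split decision rule (a decision stump) is fitted to the predictions of an opaque classifier, and its fidelity, i.e., the fraction of inputs on which it agrees with the black box, is reported. The `black_box` function is a hypothetical stand-in for a trained CNN, and the two input features are placeholders.

```python
def black_box(x):
    """Stand-in for an opaque binary classifier (hypothetical)."""
    return 1 if (x[0] ** 2 + 0.1 * x[1]) > 1.0 else 0


def fit_stump(X, y):
    """Find the rule 'feature > threshold' that best reproduces labels y.

    Returns (feature_index, threshold, fidelity), where fidelity is the
    fraction of samples on which the rule agrees with y.
    """
    best = None
    for f in range(len(X[0])):
        for t in sorted({row[f] for row in X}):
            pred = [1 if row[f] > t else 0 for row in X]
            agreement = sum(p == label for p, label in zip(pred, y)) / len(y)
            if best is None or agreement > best[2]:
                best = (f, t, agreement)
    return best


# Probe the black box on a regular grid of inputs.
X = [(i / 10, j / 10) for i in range(21) for j in range(11)]
y = [black_box(x) for x in X]

feature, threshold, fidelity = fit_stump(X, y)
```

The resulting rule is trivially easy to draw on top of the data, and the fidelity value tells the viewer how far the simple explanation can be trusted.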


Topological data analysis. Furthermore, topological data analysis 
(TDA) is an emerging field that promotes topology-based unsuper- 
vised learning techniques. TDA infers insights from the shape of 
the data, and topology has a reasonably long history in its
applications in scientific visualization [HLH*16]. A few researchers have
applied TDA to astrophysics. Novikov et al. [NCD06] were the
first to propose the method of extracting the skeleton of the cosmic
web using discrete Morse theory [For02]. Both Sousbie [Sou11]
and Shivashankar et al. [SPN*16] used discrete Morse theory to
develop geometrically intuitive methods that extract features from the
cosmic web (e.g., filaments, walls, or voids). They demonstrated
the efficiency and effectiveness of topological techniques in
astronomical tasks. Xu et al. [KXCKGN19] used TDA techniques to
identify cosmic voids and loops of filaments and assign their
statistical significance. Not many applications of topology have been
proposed in de-noising astronomy data, other than the work of
Rosen et al. [RSM*19], which uses contour trees in the de-noising
and visualization of radio astronomy (ALMA) data cubes. 
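
To give a flavor of topology-based de-noising, the sketch below computes the persistence of the local maxima of a 1D intensity profile and keeps only the peaks whose persistence exceeds a threshold. This is a much simpler relative of contour-tree simplification, not the method of Rosen et al.; the profile values and the threshold are hypothetical.

```python
def peak_persistence(signal):
    """Persistence pairs of local maxima of a 1D signal.

    Samples are swept from high to low values. A component is born at a
    local maximum and dies when it merges into a component with a higher
    peak. Returns (peak_index, birth, death) tuples sorted by decreasing
    persistence (birth - death).
    """
    order = sorted(range(len(signal)), key=lambda i: -signal[i])
    parent, peak = {}, {}  # union-find forest; root -> index of its peak

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    pairs = []
    for i in order:
        parent[i], peak[i] = i, i
        for j in (i - 1, i + 1):
            if j not in parent:
                continue
            ri, rj = find(i), find(j)
            if ri == rj:
                continue
            # The component with the lower peak is absorbed by the other.
            if signal[peak[rj]] >= signal[peak[ri]]:
                hi, lo = rj, ri
            else:
                hi, lo = ri, rj
            if peak[lo] != i:  # a real peak dies at the current level
                pairs.append((peak[lo], signal[peak[lo]], signal[i]))
            parent[lo] = hi
    root = find(order[0])
    pairs.append((peak[root], signal[peak[root]], min(signal)))
    return sorted(pairs, key=lambda p: p[2] - p[1])


# Hypothetical intensity profile with one low-persistence bump at index 5.
profile = [0, 3, 1, 5, 2, 2.5, 0]
pairs = peak_persistence(profile)
significant = [idx for idx, birth, death in pairs if birth - death >= 1.0]
```

Peaks below the threshold would be flattened or hidden in a de-noised view, while the high-persistence peaks are retained as features.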


8.7. Further Advancements in Education and Outreach 


A general ambition in science communication is to shorten the dis- 
tance between research and outreach and make current research re- 
sults and data available at science centers, museums, and in on-line 
repositories. This ambition applies to both shortening the time
between discovery and dissemination and creating increased public
access to research data. Even real-time public participation in
scientific endeavors has been shown to be of public interest [BHY18].
This science communication trend is supported by rapid develop- 
ment of commodity computing platforms capable of handling large 
datasets, availability of open research data, and improved data anal- 
ysis and visualization tools. These trends now enable visitors to 
public venues and home users to become “explorers” of scientific 
data. Astrophysics is one of the prime examples of a domain of 
large public interest and with vast amounts of publicly available 
data. 




Figure 15: Usage of a touch interface in a museum installation 
guiding the user using cues built directly into the data exploration. 


The trend described above poses several challenges. In a pub- 
lic setting, an interactive exploration has to be curated and guided 
in such a way that learning and communication goals are reached 
while not interfering with “freedom” of interaction [YRA*16].
Fig. 15 shows an example of this approach on a touch-capable ta- 
ble used in museum environments for a self-guided learning expe- 
rience, exemplified on a CT scan of a meteoroid originating from 
Mars. Ynnerman et al. [YLT18] coined the term Exploranation to 
describe the introduction of data exploration in traditional explana- 
tory contexts, and Goodman et al. [GBR18] described the need for 
interaction in explanation. This endeavor calls for a new generation 
of authoring and production tools targeting production of interac- 
tive non-linear storytelling [WH07], with interfaces that interop- 
erate with research tools and repositories. We also see that inter- 
active installations will need to feature several different modes of 
operation. A first scenario would be a "walk-up-and-use" situation 
in which the content and the interaction are intuitive. The second 
scenario is a guided experience with a trained facilitator who can 
unlock advanced features of the installation and also bring in more 
data sources for an in-depth science communication session. 


Interaction plays a central role in science communication, and 
public settings put demands on robust, engaging, intuitive, and
interactive visualization interfaces. Sundén et al. [SBJ*14] discussed
aspects of demand and the potential use of multi-modal interfaces. 
Yu et al. [YEII12, YEII16] addressed challenges posed by the in- 
teractive selections of data using touch surfaces. 


In live presentation situations based on interactive software, 
more advanced tools are needed that support the presenter (and the 
pilot). Apart from the authoring tools discussed above, research on 
features such as automatic camera moves during presentations is
also needed. An interesting challenge in view of advances in
machine learning and natural language processing is the use of voice
and gesture based interaction during presentations. Support for the
embedding of multi-media data sources and other on-line services 
is also needed. 


In outreach, the key role of visual representations should not be
underestimated, which calls for systems and tools that generate
representations that are both visually appealing and scientifically
correct. The challenge here is a trade-off between artistic and
scientific considerations. From an artistic point of view, Rector
et al. [RLF*17] aimed to strike a balance between the scientific and
aesthetic quality of an astronomical image. They pointed out that
people have different expectations and misconceptions of colored
images based on factors such as cultural variation and expertise.
Therefore, scientists need to carefully consider the choices they make
in order to create astronomical images. An example of how this
challenge is met is the work on cinematic data visualization by
Cox et al. [CPL*19, BCWW20]. Another example is the
interactive black hole visualization [VE21] described in Sect. 3.


The on-going rapid development of computer hardware creates 
opportunities and challenges as the users expect visual quality on 
the same level as the state-of-the-art games. At the same time, new 
levels of widespread public use are made possible. The challenge 
is to work with visual quality and performance as well as to create 
awareness of limited computer capabilities, data size and complex- 
ity. Another challenge for outreach is the use of social media and 
connected services, which entails not only development of tools 
and availability of data, but also engagement of a large number of 
domain experts with a science communication mission. 


9. Navigation Tool 


Together with the classification and descriptions of the papers
included in this survey, we complement our survey with a visual
literature browser available at
https://tdavislab.github.io/astrovis-survis. The visual browser
follows the same classification scheme used in this report, where the
users can use keyword searches to identify potential aspects in the
field that are underserved. Additionally, we provide an alternative
navigation tool
within the visual browser (also illustrated in Fig. 16), where the 
surveyed papers are distributed along two axes. This tool provides 
a different viewpoint for the state-of-the-art survey. 


The first axis (the x-axis), single task vs. general purpose, specifies
whether a specific paper addresses a singular challenge of
visualization in astronomy (single task), or whether it describes a more
general purpose system that can be applied to a wide array of
potential applications (general purpose). A general purpose system
also includes software systems that combine datasets of multiple
modalities in a shared contextualization. The second axis (the y-axis),
technique vs. application, specifies whether a paper develops a
specific visualization or analysis technique, or whether it combines
many different techniques to address a specific application. The
primary category, data analysis tasks, is double-encoded with
colors and marker shapes. The “other” category represents relevant
papers that are mentioned in the survey but do not belong to any of
the data analysis tasks. The coordinates of the papers in the navigation
tool are based on our best estimation. We lay out the papers in their
general areas in the figure to avoid overlap of labels.


10. Conclusions 


In this report, we provide an overview of the state of the art in as- 
trophysics visualization. We have surveyed the literature and found 
that visualization in astrophysics can be categorized broadly into 
five categories based upon the primary user objectives: data wran- 
gling, data exploration, feature identification, object reconstruction, 
and education and outreach. 


A major finding of this work is that there remains a significant 
gap between cutting-edge visualization techniques and astrophys- 
ical datasets. Among the 80+ papers surveyed, around 20 papers 
are from visualization venues. Given the scope of current and fu- 
ture datasets in astrophysics, as well as the advanced methodologies 
and capabilities in visualization research, the potential opportunity 
is great for scientific discovery in bridging this gap. However, this 
bridge will not build itself. 


We therefore take this opportunity to issue a “call to action” for 
both the visualization and astrophysics communities to consider 
more robust and intentional ways of bridging the gap between
visualization methods and astronomy data. We make the specific
recommendations below as concrete suggestions for advancing this
goal over the next decade.


We suggest the construction of a comprehensive AstroVis 
Roadmap for bringing these disparate communities and stakehold- 
ers together at both the grassroots and institutional levels. In order 
to build community, we suggest regular annual joint meetings that 
will explicitly target this gap and bring together visualization and 
astrophysics domain expertise; the 2019 Dagstuhl Seminar on the 
topic of “Astrographics: Interactive Data-Driven Journeys through 
Space” is a good example [GHWY19]. We specifically suggest 
yearly companion meetings to be held alternately at the Winter 
AAS or annual IAU meetings and the IEEE Visualization confer- 
ences. Having explicit joint sponsorship of the professional society 
is an important step in growing this joint community. 


We recognize and appreciate the “grassroots” efforts that
bring together the visualization and astrophysics communities.
Indeed, this contribution is the direct result of a Carnegie-SCI
workshop as well as the IEEE VIS 2020 workshop “Visualization
in Astrophysics”
(http://www.sci.utah.edu/~beiwang/visastro2020/).
Other efforts include the RHytHM (ResearcH
using yt Highlights Meeting) at the Flatiron Institute in 2020,
the application spotlight event
(https://sites.google.com/view/viscomsospotlight/) at IEEE VIS 2020
discussing opportunities and challenges in cosmology visualization,
and the annual glue-con hackathon
(https://www.gluesolutions.io/glue-con) that integrates astronomical
software projects including glue, yt, OpenSpace, and WorldWide
Telescope into a centralized system. These meetings are critical, in
addition to the larger meetings we suggest above.


To assist in the access to information and literature, we suggest
that an “astrovis” keyword be added to papers published in
either astrophysics journals or visualization publications, to make
such papers easy for both communities to find. Our visual
literature browser that enables exploration of “astrovis” papers is a step
in this direction.


We further suggest starting visualization/data challenges
within the large publicly available astrophysics data surveys.
The “solutions” to these challenges should be made publicly
available and thus applicable to other datasets. A few
scientific visualization (SciVis) challenges have involved astronomy
datasets at IEEE VIS conferences, notably the SciVis 2015
Contest using the Dark Sky Simulations
(https://darksky.slac.stanford.edu/) and the SciVis 2019 Contest
using the HACC cosmological simulation
(https://wordpress.cels.anl.gov/2019-scivis-contest/). A recent
example is the data challenge held at the IEEE VIS 2020 VisAstro
workshop, where a visualization tool under development called
CosmoVis was used by Burchett et al. [BAEF20] to interactively
analyze cosmological simulations generated by IllustrisTNG. In terms
of pedagogy, we recommend summer schools, hackathons, and
workshops that help members of each community engage in this joint
effort. The data challenges may serve as the seeding point for such
workshops. Science not communicated is science not done.
Looking toward science education and public outreach, we highlight the
important role played by museums and planetariums in
transitioning scientific discovery to public education. We suggest that these
stakeholders be included organically in the discovery process and
emphasize their key role in the scientific process.


We are encouraged by the progress that has been made in the last 
decade, and we look forward to the next decade of development, 
discovery, and education.